Commit 4801843

fix README.md link (#397)

1 parent 62abeb4 commit 4801843
2 files changed: +5 −5 lines changed

README.md (+4 −4)
```diff
@@ -39,20 +39,20 @@ Easy, advanced inference platform for large language models on Kubernetes
 ## Key Features
 
 - **Ease of Use**: People can quickly deploy an LLM service with minimal configuration.
-- **Broad Backends Support**: llmaz supports a wide range of advanced inference backends for different scenarios, like [vLLM](https://github.com/vllm-project/vllm), [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference), [SGLang](https://github.com/sgl-project/sglang), [llama.cpp](https://github.com/ggerganov/llama.cpp). Find the full list of supported backends [here](./docs/support-backends.md).
+- **Broad Backends Support**: llmaz supports a wide range of advanced inference backends for different scenarios, like [vLLM](https://github.com/vllm-project/vllm), [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference), [SGLang](https://github.com/sgl-project/sglang), [llama.cpp](https://github.com/ggerganov/llama.cpp). Find the full list of supported backends [here](./site/content/en/docs/integrations/support-backends.md).
 - **Accelerator Fungibility**: llmaz supports serving the same LLM with various accelerators to optimize cost and performance.
 - **Various Model Providers**: llmaz supports a wide range of model providers, such as [HuggingFace](https://huggingface.co/), [ModelScope](https://www.modelscope.cn), and ObjectStores. llmaz automatically handles the model loading, requiring no effort from users.
 - **Multi-Host Support**: llmaz supports both single-host and multi-host scenarios with [LWS](https://github.com/kubernetes-sigs/lws) from day 0.
 - **AI Gateway Support**: Offers capabilities like token-based rate limiting and model routing with the integration of [Envoy AI Gateway](https://aigateway.envoyproxy.io/).
-- **Built-in ChatUI**: Out-of-the-box chatbot support with the integration of [Open WebUI](https://github.com/open-webui/open-webui), offering capabilities like function calling, RAG, web search and more; see configurations [here](./docs/open-webui.md).
+- **Built-in ChatUI**: Out-of-the-box chatbot support with the integration of [Open WebUI](https://github.com/open-webui/open-webui), offering capabilities like function calling, RAG, web search and more; see configurations [here](./site/content/en/docs/integrations/open-webui.md).
 - **Scaling Efficiency**: llmaz supports horizontal scaling with [HPA](./docs/examples/hpa/README.md) by default and will integrate with autoscaling components like [Cluster-Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) or [Karpenter](https://github.com/kubernetes-sigs/karpenter) for smart scaling across different clouds.
 - **Efficient Model Distribution (WIP)**: Out-of-the-box model cache system support with [Manta](https://github.com/InftyAI/Manta), still under development with ongoing architecture reframing.
 
 ## Quick Start
 
 ### Installation
 
-Read the [Installation](./docs/installation.md) for guidance.
+Read the [Installation](./site/content/en/docs/installation.md) for guidance.
 
 ### Deploy
 
```
```diff
@@ -126,7 +126,7 @@ curl http://localhost:8080/v1/completions \
 
 ### More than quick-start
 
-Please refer to [examples](./docs/examples/README.md) for more tutorials or read [develop.md](./docs/develop.md) to learn more about the project.
+Please refer to [examples](./docs/examples/README.md) for more tutorials or read [develop.md](./site/content/en/docs/develop.md) to learn more about the project.
 
 ## Roadmap
 
```
site/content/en/docs/installation.md (+1 −1)
```diff
@@ -12,7 +12,7 @@ description: >
 
 - Kubernetes version >= 1.26. LWS requires Kubernetes version **v1.26 or higher**. If you are using a lower Kubernetes version and most of your workloads rely on single-node inference, we may consider replacing LWS with a Deployment-based approach. This fallback plan would involve using Kubernetes Deployments to manage single-node inference workloads efficiently. See [#32](https://github.com/InftyAI/llmaz/issues/32) for more details and updates.
 - Helm 3, see [installation](https://helm.sh/docs/intro/install/).
-- Prometheus, see [installation](https://github.com/InftyAI/llmaz/tree/main/docs/prometheus-operator#install-the-prometheus-operator).
+- Prometheus, see [installation](https://github.com/InftyAI/llmaz/blob/main/site/content/en/docs/integrations/prometheus-operator.md#install-the-prometheus-operator).
 
 Note: the llmaz helm chart will by default install
 - [Envoy Gateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) as the frontier in the llmaz-system; if you *already installed these two components* or *want to deploy in other namespaces*, append `--set envoy-gateway.enabled=false --set envoy-ai-gateway.enabled=false` to the command below.
```
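The note above can be sketched as a full install command. The chart reference `inftyai/llmaz` and the `llmaz-system` namespace are assumptions for illustration; only the two `--set` flags come from the diff. The sketch prints the command rather than executing it, so it runs without a cluster:

```shell
#!/bin/sh
set -eu

# The two flags below are quoted from the installation note; they disable the
# bundled Envoy Gateway and Envoy AI Gateway when those are already installed
# or should live in another namespace.
FLAGS="--set envoy-gateway.enabled=false --set envoy-ai-gateway.enabled=false"

# Hypothetical chart reference and namespace, assumed for illustration only.
CMD="helm install llmaz inftyai/llmaz --namespace llmaz-system --create-namespace $FLAGS"

# Print instead of executing, so the sketch is safe to run anywhere.
echo "$CMD"
```

Dropping the two `--set` flags restores the default behavior described in the note, where the chart installs both gateways into llmaz-system.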
