diff --git a/README.md b/README.md
index e2f8e439..ee99974c 100644
--- a/README.md
+++ b/README.md
@@ -32,13 +32,13 @@ Easy, advanced inference platform for large language models on Kubernetes

- **Ease of Use**: People can quickly deploy an LLM service with minimal configurations.
- **Broad Backend Support**: llmaz supports a wide range of advanced inference backends for different scenarios, like [vLLM](https://github.com/vllm-project/vllm), [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference), [SGLang](https://github.com/sgl-project/sglang), [llama.cpp](https://github.com/ggerganov/llama.cpp). Find the full list of supported backends [here](./docs/support-backends.md).
-- **Efficient Model Distribution (WIP)**: Out-of-the-box model cache system support with [Manta](https://github.com/InftyAI/Manta), still under development right now with architecture reframing.
- **Accelerator Fungibility**: llmaz supports serving the same LLM with various accelerators to optimize cost and performance.
-- **SOTA Inference**: llmaz supports the latest cutting-edge researches like [Speculative Decoding](https://arxiv.org/abs/2211.17192) or [Splitwise](https://arxiv.org/abs/2311.18677)(WIP) to run on Kubernetes.
- **Various Model Providers**: llmaz supports a wide range of model providers, such as [HuggingFace](https://huggingface.co/), [ModelScope](https://www.modelscope.cn), ObjectStores. llmaz will automatically handle the model loading, requiring no effort from users.
- **Multi-Host Support**: llmaz supports both single-host and multi-host scenarios with [LWS](https://github.com/kubernetes-sigs/lws) from day 0.
+- **AI Gateway Support**: Offering capabilities like token-based rate limiting and model routing through the integration of [Envoy AI Gateway](https://aigateway.envoyproxy.io/).
+- **Built-in ChatUI**: Out-of-the-box chatbot support with the integration of [Open WebUI](https://github.com/open-webui/open-webui), offering capabilities like function calling, RAG, web search and more; see configurations [here](./docs/open-webui.md).
- **Scaling Efficiency**: llmaz supports horizontal scaling with [HPA](./docs/examples/hpa/README.md) by default and will integrate with autoscaling components like [Cluster-Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) or [Karpenter](https://github.com/kubernetes-sigs/karpenter) for smart scaling across different clouds.
-- **Build-in ChatUI**: Out-of-the-box chatbot support with the integration of [Open WebUI](https://github.com/open-webui/open-webui), see configurations [here](./docs/open-webui.md).
+- **Efficient Model Distribution (WIP)**: Out-of-the-box model cache system support with [Manta](https://github.com/InftyAI/Manta), currently under development while the architecture is reframed.

## Quick Start

@@ -51,7 +51,7 @@ Read the [Installation](./docs/installation.md) for guidance.

Here's a toy example for deploying `facebook/opt-125m`; all you need to do is apply a `Model` and a `Playground`.

-If you're running on CPUs, you can refer to [llama.cpp](/docs/examples/llamacpp/README.md), or more [examples](/docs/examples/README.md) here.
+If you're running on CPUs, you can refer to [llama.cpp](/docs/examples/llamacpp/README.md).

> Note: if your model needs a Hugging Face token for weight downloads, please run `kubectl create secret generic modelhub-secret --from-literal=HF_TOKEN=<YOUR_HF_TOKEN>` beforehand.
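+For example, a minimal sketch of the secret creation — the token value below is a placeholder, substitute your own Hugging Face token:
+
+```bash
+# The HF_TOKEN value is illustrative only; use your real token
+# from https://huggingface.co/settings/tokens.
+kubectl create secret generic modelhub-secret \
+  --from-literal=HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
+```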
@@ -118,14 +118,13 @@ curl http://localhost:8080/v1/completions \

### More than quick-start

-If you want to learn more about this project, please refer to [develop.md](./docs/develop.md).
+Please refer to [examples](./docs/examples/README.md) for more tutorials, or read [develop.md](./docs/develop.md) to learn more about the project.

## Roadmap

-- Gateway support for traffic routing
-- Metrics support
- Serverless support for cloud-agnostic users
-- CLI tool support
+- Prefill-Decode disaggregated serving
+- KV cache offload support
- Model training, fine-tuning in the long-term

## Community

diff --git a/chart/Chart.lock b/chart/Chart.lock
index a0da65ee..b4edee7f 100644
--- a/chart/Chart.lock
+++ b/chart/Chart.lock
@@ -2,5 +2,11 @@ dependencies:
- name: open-webui
  repository: https://helm.openwebui.com/
  version: 6.4.0
-digest: sha256:2520f6e26f2e6fd3e51c5f7f940eef94217c125a9828b0f59decedbecddcdb29
-generated: "2025-04-21T00:50:06.532039+08:00"
+- name: gateway-helm
+  repository: oci://registry-1.docker.io/envoyproxy/
+  version: 0.0.0-latest
+- name: ai-gateway-helm
+  repository: oci://registry-1.docker.io/envoyproxy/
+  version: v0.0.0-latest
+digest: sha256:c7b1aa22097a6a1a6f4dd04beed3287ab8ef2ae1aec8a9a4ec7a71251be23e4c
+generated: "2025-04-22T20:15:43.343515+08:00"
diff --git a/chart/Chart.yaml b/chart/Chart.yaml
index f452fc8e..56eaad2e 100644
--- a/chart/Chart.yaml
+++ b/chart/Chart.yaml
@@ -25,11 +25,11 @@ dependencies:
    version: "6.4.0"
    repository: "https://helm.openwebui.com/"
    condition: open-webui.enabled
-  - name: envoy-gateway
-    version: v1.3.2
-    repository: oci://docker.io/envoyproxy/gateway-helm
+  - name: gateway-helm
+    version: 0.0.0-latest
+    repository: "oci://registry-1.docker.io/envoyproxy/"
    condition: envoy-gateway.enabled
-  - name: envoy-ai-gateway
-    version: v0.1.5
-    repository: oci://docker.io/envoyproxy/ai-gateway-helm
+  - name: ai-gateway-helm
+    version: v0.0.0-latest
+    repository: "oci://registry-1.docker.io/envoyproxy/"
    condition: envoy-ai-gateway.enabled
diff --git a/chart/values.global.yaml b/chart/values.global.yaml
index ad9de873..04f2f5e2 100644
--- a/chart/values.global.yaml
+++ b/chart/values.global.yaml
@@ -34,7 +34,7 @@ prometheus:
  enabled: true

open-webui:
-  enabled: false
+  enabled: true
  persistence:
    enabled: false
  enableOpenaiApi: true
diff --git a/docs/envoy-ai-gateway.md b/docs/envoy-ai-gateway.md
new file mode 100644
index 00000000..69d6d920
--- /dev/null
+++ b/docs/envoy-ai-gateway.md
@@ -0,0 +1,106 @@
+# Envoy AI Gateway
+
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
+to handle request traffic from application clients to Generative AI services.
+
+## How to use
+
+### 1. Enable Envoy Gateway and Envoy AI Gateway
+
+Both of them are enabled by default in `values.global.yaml` and will be deployed in the llmaz-system namespace.
+
+```yaml
+envoy-gateway:
+  enabled: true
+envoy-ai-gateway:
+  enabled: true
+```
+
+However, [Envoy Gateway](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway](https://aigateway.envoyproxy.io/docs/getting-started/) can also be deployed standalone in case you want to run them in other namespaces.
+
+### 2. Basic AI Gateway Example
+
+To expose your models via Envoy Gateway, you need to create a GatewayClass, Gateway, and AIGatewayRoute. The following example shows how to do this.
+
+We'll deploy two models, `Qwen/Qwen2-0.5B-Instruct-GGUF` and `Qwen/Qwen2.5-Coder-0.5B-Instruct-GGUF`, with llama.cpp (CPU only) and expose them via Envoy AI Gateway.
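+The whole setup can be applied in one step; a minimal sketch, assuming `kubectl` points at your cluster and you run it from the repository root:
+
+```bash
+# Creates the two OpenModels and Playgrounds plus the GatewayClass, Gateway,
+# AIGatewayRoute, and AIServiceBackend resources shown in the manifest.
+kubectl apply -f docs/examples/envoy-ai-gateway/basic.yaml
+```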
+The full example manifest is [here](./examples/envoy-ai-gateway/basic.yaml).
+
+### 3. Check Envoy AI Gateway APIs
+
+If Open WebUI is enabled, you can chat via the web UI (recommended); see the [documentation](./open-webui.md). Otherwise, follow the steps below to test the Envoy AI Gateway APIs.
+
+I. Port-forward the `LoadBalancer` service in llmaz-system to local port 8080.
+
+II. Query the models endpoint, e.g. `curl http://localhost:8080/v1/models | jq .`; the available models will be listed. The expected response will look like this:
+
+```json
+{
+  "data": [
+    {
+      "id": "qwen2-0.5b",
+      "created": 1745327294,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    },
+    {
+      "id": "qwen2.5-coder",
+      "created": 1745327294,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    }
+  ],
+  "object": "list"
+}
+```
+
+III. Query `http://localhost:8080/v1/chat/completions` to chat with the model. Here we ask the `qwen2-0.5b` model; the query will look like:
+
+```bash
+curl -H "Content-Type: application/json" -d '{
+    "model": "qwen2-0.5b",
+    "messages": [
+      {
+        "role": "system",
+        "content": "Hi."
+      }
+    ]
+  }' http://localhost:8080/v1/chat/completions | jq .
+```
+
+The expected response will look like this:
+
+```json
+{
+  "choices": [
+    {
+      "finish_reason": "stop",
+      "index": 0,
+      "message": {
+        "role": "assistant",
+        "content": "Hello! How can I assist you today?"
+      }
+    }
+  ],
+  "created": 1745327371,
+  "model": "qwen2-0.5b",
+  "system_fingerprint": "b5124-bc091a4d",
+  "object": "chat.completion",
+  "usage": {
+    "completion_tokens": 10,
+    "prompt_tokens": 10,
+    "total_tokens": 20
+  },
+  "id": "chatcmpl-AODlT8xnf4OjJwpQH31XD4yehHLnurr0",
+  "timings": {
+    "prompt_n": 1,
+    "prompt_ms": 319.876,
+    "prompt_per_token_ms": 319.876,
+    "prompt_per_second": 3.1262114069201816,
+    "predicted_n": 10,
+    "predicted_ms": 1309.393,
+    "predicted_per_token_ms": 130.9393,
+    "predicted_per_second": 7.63712651587415
+  }
+}
+```
diff --git a/docs/examples/envoy-ai-gateway/README.md b/docs/examples/envoy-ai-gateway/README.md
deleted file mode 100644
index 1222dacd..00000000
--- a/docs/examples/envoy-ai-gateway/README.md
+++ /dev/null
@@ -1,101 +0,0 @@
-# Envoy AI Gateway
-
-[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
-to handle request traffic from application clients to Generative AI services.
-
-## How to use
-
-### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm
-
-Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file, envoy gateway and envoy ai gateway are disabled by default.
-
-```yaml
-envoy-gateway:
-  enabled: true
-envoy-ai-gateway:
-  enabled: true
-```
-
-Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/) can be done standalone.
-
-### 2. Check Envoy Gateway and Envoy AI Gateway
-
-Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for the envoy gateway to be ready.
-
-Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for the envoy ai gateway to be ready.
-
-### 3. Basic AI Gateway example
-
-To expose your model(Playground) to Envoy Gateway, you need to create a GatewayClass, Gateway, and AIGatewayRoute. The following example shows how to do this.
-
-Example [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway.
-The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b`, and the model lb service: `qwen2-0--5b-lb` -- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml) -- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml) - -Check if the gateway pod to be ready: - -```bash -kubectl wait pods --timeout=2m \ - -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \ - -n envoy-gateway-system \ - --for=condition=Ready -``` - -### 4. Check Envoy AI Gateway APIs - -- For local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`. -- Using external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')` - -See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details. - -`$GATEWAY_URL/v1/models` will show the models that are available in the Envoy AI Gateway. The response will look like this: - -```json -{ - "data": [ - { - "id": "some-cool-self-hosted-model", - "created": 1744880950, - "object": "model", - "owned_by": "Envoy AI Gateway" - }, - { - "id": "qwen2-0.5b", - "created": 1744880950, - "object": "model", - "owned_by": "Envoy AI Gateway" - } - ], - "object": "list" -} -``` - -`$GATEWAY_URL/v1/chat/completions` will show the chat completions for the model. The request will look like this: - -```bash -curl -H "Content-Type: application/json" -d '{ - "model": "qwen2-0.5b", - "messages": [ - { - "role": "system", - "content": "Hi." - } - ] - }' $GATEWAY_URL/v1/chat/completions -``` - -Expected response will look like this: - -```json -{ - "choices": [ - { - "message": { - "content": "I'll be back." - } - } - ] -} -``` - diff --git a/docs/examples/envoy-ai-gateway/basic.yaml b/docs/examples/envoy-ai-gateway/basic.yaml index 2e2f79e1..0e5094b5 100644 --- a/docs/examples/envoy-ai-gateway/basic.yaml +++ b/docs/examples/envoy-ai-gateway/basic.yaml @@ -1,17 +1,67 @@ +apiVersion: llmaz.io/v1alpha1 +kind: OpenModel +metadata: + name: qwen2-0--5b +spec: + familyName: qwen2 + source: + modelHub: + modelID: Qwen/Qwen2-0.5B-Instruct-GGUF + filename: qwen2-0_5b-instruct-q5_k_m.gguf +--- +apiVersion: inference.llmaz.io/v1alpha1 +kind: Playground +metadata: + name: qwen2-0--5b +spec: + replicas: 1 + modelClaim: + modelName: qwen2-0--5b + backendRuntimeConfig: + backendName: llamacpp + configName: default + args: + - -fa # use flash attention +--- +apiVersion: llmaz.io/v1alpha1 +kind: OpenModel +metadata: + name: qwen2--5-coder +spec: + familyName: qwen2 + source: + modelHub: + modelID: Qwen/Qwen2.5-Coder-0.5B-Instruct-GGUF + filename: qwen2.5-coder-0.5b-instruct-q2_k.gguf +--- +apiVersion: inference.llmaz.io/v1alpha1 +kind: Playground +metadata: + name: qwen2--5-coder +spec: + replicas: 1 + modelClaim: + modelName: qwen2--5-coder + backendRuntimeConfig: + backendName: llamacpp + configName: default + args: + - -fa # use flash attention +--- apiVersion: gateway.networking.k8s.io/v1 kind: GatewayClass metadata: - name: envoy-ai-gateway-basic + name: default-envoy-ai-gateway spec: controllerName: gateway.envoyproxy.io/gatewayclass-controller --- apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: - name: envoy-ai-gateway-basic + name: default-envoy-ai-gateway namespace: default spec: - gatewayClassName: envoy-ai-gateway-basic + gatewayClassName: default-envoy-ai-gateway listeners: - name: http protocol: HTTP @@ -20,35 +70,57 @@ spec: apiVersion: aigateway.envoyproxy.io/v1alpha1 
kind: AIGatewayRoute metadata: - name: envoy-ai-gateway-basic + name: default-envoy-ai-gateway namespace: default spec: schema: name: OpenAI targetRefs: - - name: envoy-ai-gateway-basic + - name: default-envoy-ai-gateway kind: Gateway group: gateway.networking.k8s.io rules: - -# Above are basic config for envoy ai gateway -# Below is example for qwen2-0.5b: a matched backend ref and the AIServiceBackend - matches: - headers: - type: Exact name: x-ai-eg-model value: qwen2-0.5b backendRefs: - - name: envoy-ai-gateway-llmaz-model-1 + - name: qwen2-0--5b + - matches: + - headers: + - type: Exact + name: x-ai-eg-model + value: qwen2.5-coder + backendRefs: + - name: qwen2--5-coder --- apiVersion: aigateway.envoyproxy.io/v1alpha1 kind: AIServiceBackend metadata: - name: envoy-ai-gateway-llmaz-model-1 + name: qwen2-0--5b namespace: default spec: + timeouts: + request: 3m schema: name: OpenAI backendRef: name: qwen2-0--5b-lb - kind: Service \ No newline at end of file + kind: Service + port: 8080 +--- +apiVersion: aigateway.envoyproxy.io/v1alpha1 +kind: AIServiceBackend +metadata: + name: qwen2--5-coder + namespace: default +spec: + timeouts: + request: 3m + schema: + name: OpenAI + backendRef: + name: qwen2--5-coder-lb + kind: Service + port: 8080 diff --git a/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md b/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md deleted file mode 100644 index 5681d61a..00000000 --- a/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md +++ /dev/null @@ -1,102 +0,0 @@ -# Envoy AI Gateway - -[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway -to handle request traffic from application clients to Generative AI services. - -## How to use - -### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm - -Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file, envoy gateway and envoy ai gateway are enabled by default. - -```yaml -envoy-gateway: - enabled: true -envoy-ai-gateway: - enabled: true -``` - -Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/) can be done standalone. - -### 2. Check Envoy Gateway and Envoy AI Gateway - -Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for the envoy gateway to be ready. - -Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for the envoy ai gateway to be ready. - -### 3. Basic AI Gateway example - -To expose your model(Playground) to Envoy Gateway, you need to create a GatewayClass, Gateway, and AIGatewayRoute. The following example shows how to do this. - -Example [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway. -The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b`, and the model lb service: `qwen2-0--5b-lb` - -- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml) -- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml) - -Check if the gateway pod to be ready: - -```bash -kubectl wait pods --timeout=2m \ - -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \ - -n envoy-gateway-system \ - --for=condition=Ready -``` - -### 4. 
Check Envoy AI Gateway APIs
-
-- For local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
-- Using external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`
-
-See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.
-
-`$GATEWAY_URL/v1/models` will show the models that are available in the Envoy AI Gateway. The response will look like this:
-
-```json
-{
-  "data": [
-    {
-      "id": "some-cool-self-hosted-model",
-      "created": 1744880950,
-      "object": "model",
-      "owned_by": "Envoy AI Gateway"
-    },
-    {
-      "id": "qwen2-0.5b",
-      "created": 1744880950,
-      "object": "model",
-      "owned_by": "Envoy AI Gateway"
-    }
-  ],
-  "object": "list"
-}
-```
-
-`$GATEWAY_URL/v1/chat/completions` will show the chat completions for the model. The request will look like this:
-
-```bash
-curl -H "Content-Type: application/json" -d '{
-    "model": "qwen2-0.5b",
-    "messages": [
-      {
-        "role": "system",
-        "content": "Hi."
-      }
-    ]
-  }' $GATEWAY_URL/v1/chat/completions
-```
-
-Expected response will look like this:
-
-```json
-{
-  "choices": [
-    {
-      "message": {
-        "content": "I'll be back."
-      }
-    }
-  ]
-}
-```
-
diff --git a/docs/installation.md b/docs/installation.md
index e9265f44..a3914868 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -2,10 +2,16 @@

## Prerequisites

+**Requirements**:
+
- Kubernetes version >= 1.27
- Helm 3, see [installation](https://helm.sh/docs/intro/install/).
- Prometheus, see [installation](https://github.com/InftyAI/llmaz/tree/main/docs/prometheus-operator#install-the-prometheus-operator).

+Note: the llmaz helm chart will install the following by default:
+
+- [Envoy Gateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) as the traffic frontend in the llmaz-system namespace. If you *already installed these two components* or *want to deploy them in other namespaces*, append `--set envoy-gateway.enabled=false --set envoy-ai-gateway.enabled=false` to the command below.
+- [Open WebUI](https://github.com/open-webui/open-webui) as the default chatbot. If you want to disable it, append `--set open-webui.enabled=false` to the command below.
+
## Install a released version

### Install

@@ -35,6 +41,13 @@ kubectl delete crd \

## Install from source

+### Change configurations
+
+If you want to change the default configurations, please change the values in [values.global.yaml](../chart/values.global.yaml).
+
+**Do not change** the values in _values.yaml_ because it's auto-generated and will be overwritten.
+
+
### Install

```cmd
@@ -60,16 +73,6 @@ kubectl delete crd \
  services.inference.llmaz.io
```

-## Change configurations
-
-If you want to change the default configurations, please change the values in [values.global.yaml](../chart/values.global.yaml), then run
-
-```cmd
-make helm-install
-```
-
-**Do you change** the values in _values.yaml_ because it's auto-generated and will be overwritten.
-
## Upgrade

Once you've changed your code, run the command to upgrade the controller:

diff --git a/docs/open-webui.md b/docs/open-webui.md
index c673be08..d22f1534 100644
--- a/docs/open-webui.md
+++ b/docs/open-webui.md
@@ -5,11 +5,11 @@

## Prerequisites

- Make sure you're located in the **llmaz-system** namespace; it hasn't been tested with other namespaces.
-- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed, both of them are installed by default in llmaz. See [Envoy AI Gateway](docs/envoy-ai-gateway.md) for more details.
+- Make sure [Envoy Gateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed; both are installed by default in llmaz. See [AI Gateway](./envoy-ai-gateway.md) for more details.

## How to use

-1. Enable Open WebUI in the `values.global.yaml` file, open-webui is disabled by default.
+1. Enable Open WebUI in the `values.global.yaml` file; open-webui is enabled by default.

```yaml
open-webui:
@@ -18,7 +18,7 @@

> Optionally set `persistence=true` to persist the data; recommended for production.

-2. Run `kubectl get svc -n envoy-gateway-system` to list out the services, the output looks like:
+2. Run `kubectl get svc -n llmaz-system` to list the services; the output looks like:

```cmd
envoy-default-default-envoy-ai-gateway-dbec795a LoadBalancer 10.96.145.150 80:30548/TCP 132m
@@ -30,7 +30,7 @@

```yaml
open-webui:
  enabled: true
-  openaiBaseApiUrl: http://envoy-default-default-envoy-ai-gateway-dbec795a.envoy-gateway-system.svc.cluster.local/v1
+  openaiBaseApiUrl: http://envoy-default-default-envoy-ai-gateway-dbec795a.llmaz-system.svc.cluster.local/v1
```

4. Run `make install-chatbot` to install the chatbot.
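+5. To reach the UI locally, you can port-forward its service. A minimal sketch — the service name and port below are assumptions; verify them with `kubectl get svc -n llmaz-system`:
+
+```bash
+# The service name/port are illustrative; confirm them first via
+# `kubectl get svc -n llmaz-system`.
+kubectl port-forward -n llmaz-system svc/open-webui 8080:80
+# Then open http://localhost:8080 in your browser.
+```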