| 1 | +# Kubernetes Ingress to Gateway API Migration Guide |
| 2 | + |
| 3 | +## 1. Install Gateway API CRD |
| 4 | +The Kubernetes Gateway API is a newer, more flexible, and standardized way to manage ingress and egress traffic in Kubernetes clusters. KServe implements Gateway API version `1.2.1`. |
| 5 | + |
| 6 | +The Gateway API is not bundled with Kubernetes, so its CRDs must be installed manually: |
| 7 | +```shell |
| 8 | +kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml |
| 9 | +``` |
| 10 | + |
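Before moving on, it can help to confirm that the CRDs were actually registered. The helper below is a minimal sketch rather than part of the KServe tooling: the function name `missing_gateway_api_crds` is hypothetical, and it checks only the three resource kinds that this guide relies on.

```shell
# Print the Gateway API CRDs used in this guide that are not yet installed.
# Prints one CRD name per line; prints nothing when all of them are present.
# NOTE: the function name is an illustrative assumption, not a KServe command.
missing_gateway_api_crds() {
  local crd
  for crd in gatewayclasses gateways httproutes; do
    kubectl get crd "${crd}.gateway.networking.k8s.io" >/dev/null 2>&1 \
      || echo "${crd}.gateway.networking.k8s.io"
  done
}

# Usage (requires access to the cluster):
#   missing_gateway_api_crds
```

An empty result suggests the install step succeeded; any printed name points at a CRD that still needs to be applied.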
| 11 | +## 2. Create GatewayClass |
| 12 | +Create a `GatewayClass` resource using your preferred network controller. For this example, we will use [Envoy Gateway](https://gateway.envoyproxy.io/docs/) as the network controller. |
| 13 | + |
| 14 | +```yaml |
| 15 | +apiVersion: gateway.networking.k8s.io/v1 |
| 16 | +kind: GatewayClass |
| 17 | +metadata: |
| 18 | +  name: envoy |
| 19 | +spec: |
| 20 | +  controllerName: gateway.envoyproxy.io/gatewayclass-controller |
| 21 | +``` |
| 22 | + |
| 23 | +## 3. Enable Gateway API |
| 24 | +To enable Gateway API support in KServe, set `enableGatewayApi` to `true` in the `ingress` section of the `inferenceservice-config` ConfigMap. |
| 25 | + |
| 26 | +=== "Helm" |
| 27 | + |
| 28 | +    ```shell |
| 29 | +    helm upgrade kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \ |
| 30 | +      --set kserve.controller.gateway.ingressGateway.enableGatewayApi=true |
| 31 | +    ``` |
| 32 | + |
| 33 | +=== "Kubectl" |
| 34 | + |
| 35 | +    ```shell |
| 36 | +    kubectl edit configmap inferenceservice-config -n kserve |
| 37 | +    ``` |
| 38 | +    ```yaml |
| 39 | +    data: |
| 40 | +      ingress: |- |
| 41 | +        { |
| 42 | +            "enableGatewayApi": true, |
| 43 | +        } |
| 44 | +    ``` |
| 45 | + |
| 46 | +## 4. Create Gateway resource |
| 47 | +Create a `Gateway` resource to expose the `InferenceService`. In this example, we will use the `envoy` `GatewayClass` that was created in [step 2](#2-create-gatewayclass). If you already have a `Gateway` resource, you can skip this step. |
| 48 | + |
| 49 | +```yaml |
| 50 | +apiVersion: gateway.networking.k8s.io/v1 |
| 51 | +kind: Gateway |
| 52 | +metadata: |
| 53 | +  name: kserve-ingress-gateway |
| 54 | +  namespace: kserve |
| 55 | +spec: |
| 56 | +  gatewayClassName: envoy |
| 57 | +  listeners: |
| 58 | +    - name: http |
| 59 | +      protocol: HTTP |
| 60 | +      port: 80 |
| 61 | +      allowedRoutes: |
| 62 | +        namespaces: |
| 63 | +          from: All |
| 64 | +    - name: https |
| 65 | +      protocol: HTTPS |
| 66 | +      port: 443 |
| 67 | +      tls: |
| 68 | +        mode: Terminate |
| 69 | +        certificateRefs: |
| 70 | +          - kind: Secret |
| 71 | +            name: my-secret |
| 72 | +            namespace: kserve |
| 73 | +      allowedRoutes: |
| 74 | +        namespaces: |
| 75 | +          from: All |
| 76 | +  infrastructure: |
| 77 | +    labels: |
| 78 | +      serving.kserve.io/gateway: kserve-ingress-gateway |
| 79 | +``` |
| 80 | + |
| 81 | +Applying this manifest should create a gateway pod and a LoadBalancer service, which you can list with: |
| 82 | +```shell |
| 83 | +kubectl get pods,svc -l serving.kserve.io/gateway=kserve-ingress-gateway -A |
| 84 | +``` |
| 85 | + |
| 86 | +!!! success "Expected Output" |
| 87 | +    ```shell |
| 88 | +    NAMESPACE              NAME                                                                READY   STATUS    RESTARTS   AGE |
| 89 | +    envoy-gateway-system   pod/envoy-kserve-kserve-ingress-gateway-deaaa49b-6679ddc496-dlqfs   2/2     Running   0          3m52s |
| 90 | + |
| 91 | +    NAMESPACE              NAME                                                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE |
| 92 | +    envoy-gateway-system   service/envoy-kserve-kserve-ingress-gateway-deaaa49b   LoadBalancer   10.98.163.134   10.98.163.134   80:32390/TCP,443:32001/TCP   3m52s |
| 93 | +    ``` |
| 94 | + |
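Listing pods and services shows that something was created, but not whether the controller finished configuring the Gateway. As a hedged sketch, the Gateway API `Programmed` condition can be awaited with `kubectl wait`. The function name `wait_for_gateway` and the 120-second timeout are assumptions chosen for illustration; the default arguments match the example `Gateway` above.

```shell
# Wait until the Gateway reports the Programmed condition, meaning the
# controller has provisioned the data plane for it.
# NOTE: function name and timeout are illustrative assumptions.
wait_for_gateway() {
  local name="${1:-kserve-ingress-gateway}" namespace="${2:-kserve}"
  kubectl wait --for=condition=Programmed \
    "gateways.gateway.networking.k8s.io/${name}" -n "${namespace}" --timeout=120s
}

# Usage (requires access to the cluster):
#   wait_for_gateway
#   wait_for_gateway my-gateway my-namespace
```

The fully qualified resource name avoids ambiguity on clusters where another API group also defines a `Gateway` kind.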
| 95 | +!!! note |
| 96 | +    KServe can automatically create a default `Gateway` named `kserve-ingress-gateway` during installation if the Helm value `kserve.controller.gateway.ingressGateway.createGateway` is set to `true`. If you choose to use this default gateway, you can skip this step and proceed to [step 6](#6-restart-the-kserve-controller). |
| 97 | + |
| 98 | +## 5. Configure the Gateway name and namespace in KServe |
| 99 | +In the `inferenceservice-config` ConfigMap, set `kserveIngressGateway` in the `ingress` section to the Gateway's namespace and name, using the format `<gateway namespace>/<gateway name>`. In this example, we will use the `Gateway` resource that was created in [step 4](#4-create-gateway-resource). |
| 100 | + |
| 101 | +=== "Helm" |
| 102 | + |
| 103 | +    ```shell |
| 104 | +    helm upgrade kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \ |
| 105 | +      --set kserve.controller.gateway.ingressGateway.kserveGateway=kserve/kserve-ingress-gateway |
| 106 | +    ``` |
| 107 | + |
| 108 | +=== "Kubectl" |
| 109 | + |
| 110 | +    ```shell |
| 111 | +    kubectl edit configmap inferenceservice-config -n kserve |
| 112 | +    ``` |
| 113 | +    ```yaml |
| 114 | +    data: |
| 115 | +      ingress: |- |
| 116 | +        { |
| 117 | +            "kserveIngressGateway": "kserve/kserve-ingress-gateway", |
| 118 | +        } |
| 119 | +    ``` |
| 120 | + |
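The `<gateway namespace>/<gateway name>` format is easy to get backwards, so a quick way to sanity-check a value is to split it and look up the resulting `Gateway`. The snippet below is a small sketch using plain shell parameter expansion; the variable names are illustrative and not read by KServe.

```shell
# Value as configured in the ConfigMap; this one matches the guide's example.
KSERVE_INGRESS_GATEWAY="kserve/kserve-ingress-gateway"

# Everything before the first "/" is the namespace, everything after is the name.
GATEWAY_NAMESPACE="${KSERVE_INGRESS_GATEWAY%%/*}"
GATEWAY_NAME="${KSERVE_INGRESS_GATEWAY#*/}"

echo "namespace=${GATEWAY_NAMESPACE} name=${GATEWAY_NAME}"

# With cluster access, the Gateway can then be looked up (not run here):
#   kubectl get gateways.gateway.networking.k8s.io "${GATEWAY_NAME}" -n "${GATEWAY_NAMESPACE}"
```

If the lookup fails, the configured value most likely does not match the namespace and name of an existing `Gateway`.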
| 121 | +## 6. Restart the KServe controller |
| 122 | +The existing InferenceServices will not use the Gateway API configuration until the next reconciliation. |
| 123 | +You can restart the KServe controller to trigger the reconciliation and apply the Gateway API configuration to all the existing InferenceServices. |
| 124 | +```shell |
| 125 | +kubectl rollout restart deployment kserve-controller-manager -n kserve |
| 126 | +``` |
| 127 | + |
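`kubectl rollout restart` returns immediately, so the new controller pod may not be ready yet when you continue. A minimal sketch for blocking until the rollout completes is shown below; the function name `wait_for_kserve_controller` and the default 120-second timeout are assumptions made for illustration.

```shell
# Block until the restarted controller Deployment has finished rolling out,
# or fail once the timeout expires.
# NOTE: function name and default timeout are illustrative assumptions.
wait_for_kserve_controller() {
  kubectl rollout status deployment kserve-controller-manager -n kserve \
    --timeout="${1:-120s}"
}

# Usage (requires access to the cluster):
#   wait_for_kserve_controller
#   wait_for_kserve_controller 5m
```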
| 128 | +## 7. Configure the external traffic |
| 129 | +If you are using a cloud provider, you may need to configure the external traffic to the LoadBalancer service created in [step 4](#4-create-gateway-resource). |
| 130 | + |
| 131 | +```shell |
| 132 | +kubectl get svc -l serving.kserve.io/gateway=kserve-ingress-gateway -A |
| 133 | +``` |
| 134 | + |
| 135 | +## 8. Verify the Gateway API configuration |
| 136 | +Create an InferenceService to verify that the Gateway API configuration is applied. |
| 137 | + |
| 138 | +```shell |
| 139 | +kubectl apply -f - <<EOF |
| 140 | +apiVersion: "serving.kserve.io/v1beta1" |
| 141 | +kind: "InferenceService" |
| 142 | +metadata: |
| 143 | +  name: "sklearn-v2-iris" |
| 144 | +spec: |
| 145 | +  predictor: |
| 146 | +    model: |
| 147 | +      modelFormat: |
| 148 | +        name: sklearn |
| 149 | +      protocolVersion: v2 |
| 150 | +      runtime: kserve-sklearnserver |
| 151 | +      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model" |
| 152 | +EOF |
| 153 | +``` |
| 154 | + |
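To check that the InferenceService is actually routed through the Gateway API, one option is to wait for it to become ready and then list the `HTTPRoute` resources in its namespace. This is a sketch under two assumptions: that KServe creates `HTTPRoute` objects for the service when Gateway API support is enabled, and that the service lives in the `default` namespace unless you pass another one. The function name `wait_and_list_routes` is hypothetical.

```shell
# Wait for the InferenceService to report Ready, then list the HTTPRoutes
# in the same namespace so you can confirm that routes were created for it.
# NOTE: function name, default namespace, and timeout are assumptions.
wait_and_list_routes() {
  local name="$1" namespace="${2:-default}"
  kubectl wait --for=condition=Ready "inferenceservice/${name}" \
    -n "${namespace}" --timeout=300s \
    && kubectl get httproutes.gateway.networking.k8s.io -n "${namespace}"
}

# Usage (requires access to the cluster):
#   wait_and_list_routes sklearn-v2-iris
```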
| 155 | +Execute the following command to determine whether the Kubernetes cluster is running in an environment that supports external load balancers: |
| 156 | +```shell |
| 157 | +kubectl get svc -l serving.kserve.io/gateway=kserve-ingress-gateway -A |
| 158 | +``` |
| 159 | + |
| 160 | +!!! success "Expected Output" |
| 161 | +    ```shell |
| 162 | +    NAMESPACE              NAME                                           TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE |
| 163 | +    envoy-gateway-system   envoy-kserve-kserve-ingress-gateway-deaaa49b   LoadBalancer   10.98.163.134   10.98.163.134   80:32390/TCP,443:32001/TCP   54s |
| 164 | +    ``` |
| 165 | + |
| 166 | +=== "Load Balancer" |
| 167 | +    If the EXTERNAL-IP value is set, your environment has an external load balancer that you can use for the ingress gateway. |
| 168 | + |
| 169 | +    ```bash |
| 170 | +    export INGRESS_HOST=$(kubectl get service -l serving.kserve.io/gateway=kserve-ingress-gateway -A -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}') |
| 171 | +    export INGRESS_PORT=$(kubectl get service -l serving.kserve.io/gateway=kserve-ingress-gateway -A -o jsonpath='{.items[0].spec.ports[?(@.name=="http-80")].port}') |
| 172 | +    ``` |
| 173 | + |
| 174 | +=== "Node Port" |
| 175 | +    If the EXTERNAL-IP value is none (or perpetually pending), your environment does not provide an external load balancer for the ingress gateway. |
| 176 | +    In this case, you can access the gateway using the service’s node port. |
| 177 | +    ```bash |
| 178 | +    # GKE |
| 179 | +    export INGRESS_HOST=worker-node-address |
| 180 | +    # Minikube |
| 181 | +    export INGRESS_HOST=$(minikube ip) |
| 182 | +    # Other environments (on-prem) |
| 183 | +    export INGRESS_HOST=$(kubectl get po -l serving.kserve.io/gateway=kserve-ingress-gateway -A -o jsonpath='{.items[0].status.hostIP}') |
| 184 | +    export INGRESS_PORT=$(kubectl get service -l serving.kserve.io/gateway=kserve-ingress-gateway -A -o jsonpath='{.items[0].spec.ports[?(@.name=="http-80")].nodePort}') |
| 185 | +    ``` |
| 186 | + |
| 187 | +=== "Port Forward" |
| 188 | +    Alternatively, you can use port forwarding for testing purposes. |
| 189 | +    ```bash |
| 190 | +    INGRESS_GATEWAY_SERVICE=$(kubectl get svc -l serving.kserve.io/gateway=kserve-ingress-gateway -A --output jsonpath='{.items[0].metadata.name}') |
| 191 | +    INGRESS_GATEWAY_NAMESPACE=$(kubectl get svc -l serving.kserve.io/gateway=kserve-ingress-gateway -A --output jsonpath='{.items[0].metadata.namespace}') |
| 192 | +    kubectl port-forward --namespace "${INGRESS_GATEWAY_NAMESPACE}" "svc/${INGRESS_GATEWAY_SERVICE}" 8080:80 |
| 193 | +    ``` |
| 194 | +    Open another terminal and set the following variables before sending inference requests: |
| 195 | +    ```bash |
| 196 | +    export INGRESS_HOST=localhost |
| 197 | +    export INGRESS_PORT=8080 |
| 198 | +    ``` |
| 199 | + |
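Whichever option you used, both `INGRESS_HOST` and `INGRESS_PORT` need to be non-empty before sending a request; an empty value usually means the `jsonpath` query matched nothing. The helper below is a small illustrative sketch (the function name `ingress_base_url` is hypothetical) that fails early instead of letting `curl` run against a malformed URL.

```shell
# Print the base URL for the ingress gateway, or fail with a message on
# stderr when either variable is unset or empty.
# NOTE: the function name is an illustrative assumption.
ingress_base_url() {
  if [ -z "${INGRESS_HOST:-}" ] || [ -z "${INGRESS_PORT:-}" ]; then
    echo "INGRESS_HOST and INGRESS_PORT must both be set" >&2
    return 1
  fi
  echo "http://${INGRESS_HOST}:${INGRESS_PORT}"
}

# Usage:
#   ingress_base_url && echo "ready to send requests"
```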
| 200 | +Create a file named `iris-input-v2.json` with the sample input. |
| 201 | +```json |
| 202 | +{ |
| 203 | +  "inputs": [ |
| 204 | +    { |
| 205 | +      "name": "input-0", |
| 206 | +      "shape": [2, 4], |
| 207 | +      "datatype": "FP32", |
| 208 | +      "data": [ |
| 209 | +        [6.8, 2.8, 4.8, 1.4], |
| 210 | +        [6.0, 3.4, 4.5, 1.6] |
| 211 | +      ] |
| 212 | +    } |
| 213 | +  ] |
| 214 | +} |
| 215 | +``` |
| 216 | +Now, verify the InferenceService is accessible outside the cluster using `curl`. |
| 217 | +```shell |
| 218 | +SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3) |
| 219 | + |
| 220 | +curl -v \ |
| 221 | +  -H "Host: ${SERVICE_HOSTNAME}" \ |
| 222 | +  -H "Content-Type: application/json" \ |
| 223 | +  -d @./iris-input-v2.json \ |
| 224 | +  "http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/sklearn-v2-iris/infer" |
| 225 | +``` |
| 226 | + |
| 227 | +!!! success "Expected Output" |
| 228 | +    ```json |
| 229 | +    { |
| 230 | +      "id": "823248cc-d770-4a51-9606-16803395569c", |
| 231 | +      "model_name": "sklearn-v2-iris", |
| 232 | +      "outputs": [ |
| 233 | +        { |
| 234 | +          "data": [1, 1], |
| 235 | +          "datatype": "INT64", |
| 236 | +          "name": "predict", |
| 237 | +          "parameters": null, |
| 238 | +          "shape": [2] |
| 239 | +        } |
| 240 | +      ] |
| 241 | +    } |
| 242 | +    ``` |
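The `SERVICE_HOSTNAME` line in the `curl` step relies on `cut` to strip the scheme from the InferenceService URL. The sketch below shows that step in isolation; the URL is a hypothetical placeholder, since the real value comes from `.status.url` and depends on your cluster's domain configuration.

```shell
# Hypothetical status URL; on a real cluster this comes from `.status.url`.
STATUS_URL="http://sklearn-v2-iris.default.example.com"

# Splitting on "/" yields "http:", "", "<host>", so field 3 is the host.
SERVICE_HOSTNAME=$(echo "${STATUS_URL}" | cut -d "/" -f 3)

echo "${SERVICE_HOSTNAME}"
# prints: sklearn-v2-iris.default.example.com
```

The extracted value is what the request sends in the `Host` header, which is how the gateway matches the request to the right InferenceService.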