Commit e7b191a

Robert Bailey and artemvmin authored
Cherry-pick #635 to release-1.1 branch (#637)
Update RAG to use Autopilot by default (#635)

Remove DNS troubleshooting information, as this has been patched.

Co-authored-by: artemvmin <[email protected]>
1 parent 422073b commit e7b191a

4 files changed: +19 -25 lines changed

applications/rag/README.md

+15 -21
@@ -31,20 +31,17 @@ Install the following on your computer:
 
 ### Bring your own cluster (optional)
 
-By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.
 
 If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.
 
-Use the provided infrastructue module to create a cluster:
-
-1. `cd ai-on-gke/infrastructure`
-
-2. Edit `platform.tfvars` to set your project ID, location and cluster name. The other fields are optional. Ensure you create an L4 nodepool as this tutorial requires it.
-
-3. Run `terraform init`
-
-4. Run `terraform apply --var-file workloads.tfvars`
+Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.
 
+```
+gcloud container clusters create-auto rag-cluster \
+--location us-central1 \
+--cluster-version 1.28
+```
 ### Bring your own VPC (optional)
 
 By default, this tutorial creates a new network on your behalf with [Private Service Connect](https://cloud.google.com/vpc/docs/private-service-connect) already enabled. We highly recommend following the default settings.
@@ -64,10 +61,11 @@ This section sets up the RAG infrastructure in your GCP project using Terraform.
 1. `cd ai-on-gke/applications/rag`
 
 2. Edit `workloads.tfvars` to set your project ID, location, cluster name, and GCS bucket name. Ensure the `gcs_bucket` name is globally unique (add a random suffix). Optionally, make the following changes:
-* (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
 * (Recommended) [Enable authenticated access](#configure-authenticated-access-via-iap) for JupyterHub, frontend chat and Ray dashboard services.
-* (Not recommended) Set `create_cluster = false` if you bring your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled.
-* (Not recommended) Set `create_network = false` if you bring your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
+* (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
+* (Optional) Set `autopilot_cluster = false` to deploy using GKE Standard.
+* (Optional) Set `create_cluster = false` if you are bringing your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled. You can simplify setup by following the Terraform instructions in [`infrastructure/README.md`](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/infrastructure/README.md).
+* (Optional) Set `create_network = false` if you are bringing your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
 
 3. Run `terraform init`

@@ -193,17 +191,13 @@ Connect to the GKE cluster:
 gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_LOCATION}
 ```
 
-1. Troubleshoot JupyterHub job failures:
-- If the JupyterHub job fails to start the proxy with error code 599, it is likely an known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-- Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-2. Troubleshoot Ray job failures:
+1. Troubleshoot Ray job failures:
 - If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
 - Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head and Ray worker pod in `Running` state. If your ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
 - Often, retrying the Ray job submission (the last cell of the notebook) helps.
 - The Ray job may take 15-20 minutes to run the first time due to environment setup.
 
-3. Troubleshoot IAP login issues:
+2. Troubleshoot IAP login issues:
 - Verify the cert is Active:
 - For JupyterHub `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
 - For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
@@ -213,14 +207,14 @@ gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_L
 - Org error:
 - The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.
 
-4. Troubleshoot `terraform apply` failures:
+3. Troubleshoot `terraform apply` failures:
 - Inference server (`mistral`) fails to deploy:
 - This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
 - GCS bucket already exists:
 - GCS bucket names have to be globally unique, pick a different name with a random suffix.
 - Cloud SQL instance already exists:
 - Ensure the `cloudsql_instance` name doesn't already exist in your project.
 
-5. Troubleshoot `terraform destroy` failures:
+4. Troubleshoot `terraform destroy` failures:
 - Network deletion issue:
 - `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.

applications/rag/metadata.yaml

+2 -2
@@ -28,8 +28,8 @@ spec:
 varType: string
 defaultValue: "created-by=gke-ai-quick-start-solutions,ai.gke.io=rag"
 - name: autopilot_cluster
-varType: string
-defaultValue: false
+varType: bool
+defaultValue: true
 - name: iap_consent_info
 description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.
 varType: bool

applications/rag/variables.tf

+1 -1
@@ -319,7 +319,7 @@ variable "private_cluster" {
 
 variable "autopilot_cluster" {
 type = bool
-default = false
+default = true
 }
 
 variable "cloudsql_instance" {

applications/rag/workloads.tfvars

+1 -1
@@ -20,7 +20,7 @@ subnetwork_cidr = "10.100.0.0/16"
 create_cluster = true # Creates a GKE cluster in the specified network.
 cluster_name = "<cluster-name>"
 cluster_location = "us-central1"
-autopilot_cluster = false
+autopilot_cluster = true
 private_cluster = false
 
 ## GKE environment variables
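
Note on the new default: the README hunk above says a bring-your-own cluster needs `create_cluster = false`, and that `autopilot_cluster = false` selects GKE Standard. A minimal sketch of how those settings might look together in `workloads.tfvars` follows. It is an illustrative excerpt, not part of this commit. The cluster name and location are assumptions borrowed from the README's `gcloud` example, and the remaining variables in the file are left out.

```hcl
# workloads.tfvars (illustrative excerpt, not part of this commit)

# Reuse an existing cluster instead of letting Terraform create one.
create_cluster = false

# Assumed values, matching the gcloud example in the README hunk.
cluster_name     = "rag-cluster"
cluster_location = "us-central1"

# New default after this commit. Set to false to deploy on GKE Standard,
# in which case the README asks for an L4 nodepool with autoscaling and
# node autoprovisioning enabled.
autopilot_cluster = true
```

Because `variables.tf` now defaults `autopilot_cluster` to `true`, a tfvars file that omits the variable also targets Autopilot. Only a Standard deployment needs the explicit `false`.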
