Commit 31ebe48

Update RAG to use Autopilot by default
Remove DNS troubleshooting information, as this has been patched.
1 parent 931e087 commit 31ebe48

4 files changed, 15 additions and 21 deletions

applications/rag/README.md

+11 -17
@@ -31,19 +31,17 @@ Install the following on your computer:

 ### Bring your own cluster (optional)

-By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.

 If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.

-Use the provided infrastructue module to create a cluster:
+Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.

-1. `cd ai-on-gke/infrastructure`
-
-2. Edit `platform.tfvars` to set your project ID, location and cluster name. The other fields are optional. Ensure you create an L4 nodepool as this tutorial requires it.
-
-3. Run `terraform init`
-
-4. Run `terraform apply`
+```
+gcloud container clusters create-auto rag-cluster \
+--location us-central1 \
+--cluster-version 1.28
+```

 ### Bring your own VPC (optional)

@@ -194,17 +192,13 @@ Connect to the GKE cluster:
 gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_LOCATION}
 ```

-1. Troubleshoot JupyterHub job failures:
-- If the JupyterHub job fails to start the proxy with error code 599, it is likely an known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-- Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-2. Troubleshoot Ray job failures:
+1. Troubleshoot Ray job failures:
 - If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
 - Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head and Ray worker pod in `Running` state. If your ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
 - Often, retrying the Ray job submission (the last cell of the notebook) helps.
 - The Ray job may take 15-20 minutes to run the first time due to environment setup.

-3. Troubleshoot IAP login issues:
+2. Troubleshoot IAP login issues:
 - Verify the cert is Active:
 - For JupyterHub `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
 - For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
@@ -214,15 +208,15 @@ gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_L
 - Org error:
 - The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.

-4. Troubleshoot `terraform apply` failures:
+3. Troubleshoot `terraform apply` failures:
 - Inference server (`mistral`) fails to deploy:
 - This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
 - GCS bucket already exists:
 - GCS bucket names have to be globally unique, pick a different name with a random suffix.
 - Cloud SQL instance already exists:
 - Ensure the `cloudsql_instance` name doesn't already exist in your project.

-5. Troubleshoot `terraform destroy` failures:
+4. Troubleshoot `terraform destroy` failures:
 - Network deletion issue:
 - `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.
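
The README change above says RAG needs Autopilot features that are available on the latest 1.28 and 1.29 releases. As a rough pre-flight check, a version string can be matched against those two minor lines before deploying. This is only a sketch: the function name `is_rag_supported_minor` is hypothetical and not part of the repository, and the check does not confirm that the patch release is the latest one.

```shell
# Hypothetical helper: returns 0 when a GKE version string is on the 1.28 or
# 1.29 minor line, and 1 otherwise. It does not check the patch release.
is_rag_supported_minor() {
  case "$1" in
    1.28|1.28.*|1.29|1.29.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use against an existing cluster (requires gcloud and the
# CLUSTER_NAME / CLUSTER_LOCATION variables already used in the README):
# version="$(gcloud container clusters describe "${CLUSTER_NAME}" \
#   --location="${CLUSTER_LOCATION}" --format='value(currentMasterVersion)')"
# is_rag_supported_minor "${version}" || echo "unsupported version: ${version}"
```

The gcloud call is left commented out so the snippet can be sourced without credentials.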

applications/rag/metadata.yaml

+2 -2
@@ -28,8 +28,8 @@ spec:
 varType: string
 defaultValue: "created-by=gke-ai-quick-start-solutions,ai.gke.io=rag"
 - name: autopilot_cluster
-varType: string
-defaultValue: false
+varType: bool
+defaultValue: true
 - name: iap_consent_info
 description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.
 varType: bool

applications/rag/variables.tf

+1 -1
@@ -319,7 +319,7 @@ variable "private_cluster" {

 variable "autopilot_cluster" {
 type = bool
-default = false
+default = true
 }

 variable "cloudsql_instance" {

applications/rag/workloads.tfvars

+1 -1
@@ -20,7 +20,7 @@ subnetwork_cidr = "10.100.0.0/16"
 create_cluster = true # Creates a GKE cluster in the specified network.
 cluster_name = "<cluster-name>"
 cluster_location = "us-central1"
-autopilot_cluster = false
+autopilot_cluster = true
 private_cluster = false

 ## GKE environment variables
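
With `autopilot_cluster` now defaulting to `true`, anyone who still wants the previous behaviour would presumably need to override the variable explicitly. The sketch below shows one way to do that from the command line. It assumes that Terraform is run from `applications/rag` with `workloads.tfvars` passed via `-var-file`, which this diff does not confirm. The wrapper name `apply_rag` and the `RUN` switch are hypothetical. By default it only prints the command.

```shell
# Hypothetical dry-run wrapper (not part of the repository). It prints the
# terraform command by default and only executes it when RUN=1 is set.
apply_rag() {
  autopilot="${1:-true}"   # "true" matches the new default in this commit
  case "${autopilot}" in
    true|false) ;;
    *) echo "autopilot_cluster must be true or false" >&2; return 2 ;;
  esac
  # A -var flag given after -var-file overrides the value from the file.
  set -- terraform apply -var-file=workloads.tfvars -var "autopilot_cluster=${autopilot}"
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Print the command that would opt out of Autopilot.
apply_rag false
```

Editing `autopilot_cluster` directly in `workloads.tfvars` should have the same effect. The command-line override is simply easier to try without touching the file.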
