Commit 5758520

Update RAG to use Autopilot by default (GoogleCloudPlatform#635)
Remove DNS troubleshooting information, as this has been patched.
1 parent: d5cb41c · commit: 5758520

4 files changed (+19, −24 lines)

applications/rag/README.md

+15 −20
@@ -31,19 +31,17 @@ Install the following on your computer:
 
 ### Bring your own cluster (optional)
 
-By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.
 
 If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.
 
-Use the provided infrastructue module to create a cluster:
+Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.
 
-1. `cd ai-on-gke/infrastructure`
-
-2. Edit `platform.tfvars` to set your project ID, location and cluster name. The other fields are optional. Ensure you create an L4 nodepool as this tutorial requires it.
-
-3. Run `terraform init`
-
-4. Run `terraform apply`
+```
+gcloud container clusters create-auto rag-cluster \
+  --location us-central1 \
+  --cluster-version 1.28
+```
 
 ### Bring your own VPC (optional)
 
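A quick way to sanity-check the cluster created by the new `create-auto` path (a sketch, not part of this commit; it reuses the cluster name and location from the example above and assumes current gcloud field names):

```
# Confirm the control-plane version and that Autopilot is enabled.
gcloud container clusters describe rag-cluster \
  --location us-central1 \
  --format="value(currentMasterVersion,autopilot.enabled)"

# Fetch credentials so kubectl and the RAG Terraform module can reach the cluster.
gcloud container clusters get-credentials rag-cluster --location us-central1
```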

@@ -64,10 +62,11 @@ This section sets up the RAG infrastructure in your GCP project using Terraform.
 1. `cd ai-on-gke/applications/rag`
 
 2. Edit `workloads.tfvars` to set your project ID, location, cluster name, and GCS bucket name. Ensure the `gcs_bucket` name is globally unique (add a random suffix). Optionally, make the following changes:
-    * (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
     * (Recommended) [Enable authenticated access](#configure-authenticated-access-via-iap) for JupyterHub, frontend chat and Ray dashboard services.
-    * (Not recommended) Set `create_cluster = false` if you bring your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled.
-    * (Not recommended) Set `create_network = false` if you bring your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
+    * (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
+    * (Optional) Set `autopilot_cluster = false` to deploy using GKE Standard.
+    * (Optional) Set `create_cluster = false` if you are bringing your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled. You can simplify setup by following the Terraform instructions in [`infrastructure/README.md`](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/infrastructure/README.md).
+    * (Optional) Set `create_network = false` if you are bringing your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
 
 3. Run `terraform init`
 
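As context for the steps above, the end-to-end flow the updated bullets assume might look like this (a sketch; it passes the edited variables file explicitly with `-var-file`, which is standard Terraform CLI usage rather than anything introduced by this commit):

```
cd ai-on-gke/applications/rag

# Initialize providers and modules, preview the changes, then apply them
# using the workloads.tfvars edited above (autopilot_cluster now defaults to true).
terraform init
terraform plan -var-file=workloads.tfvars
terraform apply -var-file=workloads.tfvars
```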

@@ -194,17 +193,13 @@ Connect to the GKE cluster:
 gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_LOCATION}
 ```
 
-1. Troubleshoot JupyterHub job failures:
-    - If the JupyterHub job fails to start the proxy with error code 599, it is likely an known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-    - Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-2. Troubleshoot Ray job failures:
+1. Troubleshoot Ray job failures:
     - If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
     - Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head and Ray worker pod in `Running` state. If your ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
     - Often, retrying the Ray job submission (the last cell of the notebook) helps.
     - The Ray job may take 15-20 minutes to run the first time due to environment setup.
 
-3. Troubleshoot IAP login issues:
+2. Troubleshoot IAP login issues:
     - Verify the cert is Active:
         - For JupyterHub `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
         - For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
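
The renumbered troubleshooting steps above reduce to two quick checks; a sketch using the same `${NAMESPACE}` variable as the README:

```
# Ray head and worker pods should be Running; Pending pods usually mean an L4 quota or stockout issue.
kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay

# Managed certificates backing the IAP-protected services should report Active
# (per the commands above, the frontend cert lives in the rag namespace).
kubectl get managedcertificates -n ${NAMESPACE}
```
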
@@ -214,15 +209,15 @@ gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_LOCATION}
     - Org error:
         - The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.
 
-4. Troubleshoot `terraform apply` failures:
+3. Troubleshoot `terraform apply` failures:
     - Inference server (`mistral`) fails to deploy:
         - This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
     - GCS bucket already exists:
         - GCS bucket names have to be globally unique, pick a different name with a random suffix.
     - Cloud SQL instance already exists:
         - Ensure the `cloudsql_instance` name doesn't already exist in your project.
 
-5. Troubleshoot `terraform destroy` failures:
+4. Troubleshoot `terraform destroy` failures:
     - Network deletion issue:
         - `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.
 
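For the network deletion workaround noted above, the manual cleanup might look like this (a sketch; the network and subnetwork names are placeholders, use the values from your `workloads.tfvars`):

```
# Delete the leftover subnetwork first, then the VPC network itself.
gcloud compute networks subnets delete <subnetwork-name> --region us-central1
gcloud compute networks delete <network-name>
```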

applications/rag/metadata.yaml

+2 −2
@@ -28,8 +28,8 @@ spec:
     varType: string
     defaultValue: "created-by=gke-ai-quick-start-solutions,ai.gke.io=rag"
   - name: autopilot_cluster
-    varType: string
-    defaultValue: false
+    varType: bool
+    defaultValue: true
   - name: iap_consent_info
     description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.
     varType: bool

applications/rag/variables.tf

+1 −1
@@ -319,7 +319,7 @@ variable "private_cluster" {
 
 variable "autopilot_cluster" {
   type = bool
-  default = false
+  default = true
 }
 
 variable "cloudsql_instance" {

applications/rag/workloads.tfvars

+1 −1
@@ -20,7 +20,7 @@ subnetwork_cidr = "10.100.0.0/16"
 create_cluster = true # Creates a GKE cluster in the specified network.
 cluster_name = "<cluster-name>"
 cluster_location = "us-central1"
-autopilot_cluster = false
+autopilot_cluster = true
 private_cluster = false
 
 ## GKE environment variables
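With `autopilot_cluster = true` now the default here and in `variables.tf`, a GKE Standard deployment no longer requires editing these files; the value can also be overridden per run (a sketch using standard Terraform CLI flags):

```
# One-off override back to GKE Standard without changing workloads.tfvars.
terraform apply -var-file=workloads.tfvars -var="autopilot_cluster=false"
```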
