Update RAG to use Autopilot by default (#635)
Remove DNS troubleshooting information, as this has been patched.
Co-authored-by: artemvmin <[email protected]>
applications/rag/README.md (+15 −21)
@@ -31,20 +31,17 @@ Install the following on your computer:
 ### Bring your own cluster (optional)

-By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.

 If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.

-Use the provided infrastructure module to create a cluster:
-
-1. `cd ai-on-gke/infrastructure`
-
-2. Edit `platform.tfvars` to set your project ID, location, and cluster name. The other fields are optional. Ensure you create an L4 nodepool, as this tutorial requires it.
-
-3. Run `terraform init`
-
-4. Run `terraform apply --var-file workloads.tfvars`
+Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.
+By default, this tutorial creates a new network on your behalf with [Private Service Connect](https://cloud.google.com/vpc/docs/private-service-connect) already enabled. We highly recommend following the default settings.
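The gcloud-based Autopilot creation described above could be sketched as follows. This is a sketch, not the tutorial's exact command: the project, region, and cluster names are hypothetical placeholders, and the command is only echoed for review rather than executed.

```shell
# Hypothetical names -- replace with your own project, region, and cluster.
PROJECT_ID="my-gcp-project"
REGION="us-central1"
CLUSTER_NAME="rag-cluster"

# Compose the Autopilot creation command. Autopilot clusters use
# `create-auto`; the rapid release channel tracks the newest 1.28/1.29
# releases that RAG depends on.
CMD="gcloud container clusters create-auto ${CLUSTER_NAME} --project=${PROJECT_ID} --location=${REGION} --release-channel=rapid"

# Print the command for review; paste it into Cloud Shell to execute.
echo "$CMD"
```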
@@ -64,10 +61,11 @@ This section sets up the RAG infrastructure in your GCP project using Terraform.
 1. `cd ai-on-gke/applications/rag`

 2. Edit `workloads.tfvars` to set your project ID, location, cluster name, and GCS bucket name. Ensure the `gcs_bucket` name is globally unique (add a random suffix). Optionally, make the following changes:
-    * (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
     * (Recommended) [Enable authenticated access](#configure-authenticated-access-via-iap) for JupyterHub, frontend chat and Ray dashboard services.
-    * (Not recommended) Set `create_cluster = false` if you bring your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled.
-    * (Not recommended) Set `create_network = false` if you bring your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
+    * (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
+    * (Optional) Set `autopilot_cluster = false` to deploy using GKE Standard.
+    * (Optional) Set `create_cluster = false` if you are bringing your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled. You can simplify setup by following the Terraform instructions in [`infrastructure/README.md`](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/infrastructure/README.md).
+    * (Optional) Set `create_network = false` if you are bringing your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
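The `workloads.tfvars` edit described above might look like the sketch below. Only the variable names quoted in the text (`gcs_bucket`, `kubernetes_namespace`, `autopilot_cluster`, `create_cluster`, `create_network`, `cluster_location`, `cloudsql_instance`) are taken from the source; the remaining field names and all values are assumptions, so check the comments in the shipped `workloads.tfvars` before copying.

```hcl
# Sketch of workloads.tfvars -- field names outside the documented set are assumptions.
project_id           = "my-gcp-project"   # assumed field name
cluster_name         = "rag-cluster"      # assumed field name
cluster_location     = "us-central1"
kubernetes_namespace = "rag"              # optional custom namespace
gcs_bucket           = "rag-docs-a1b2c3"  # must be globally unique; add a random suffix
autopilot_cluster    = true               # default; set false to use GKE Standard
create_cluster       = true               # set false to bring your own cluster
create_network       = true               # set false to bring your own VPC (needs Private Service Connect)
```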
-    - If the JupyterHub job fails to start the proxy with error code 599, it is likely a known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-        - Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-2. Troubleshoot Ray job failures:
+1. Troubleshoot Ray job failures:
     - If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
     - Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head and a Ray worker pod in `Running` state. If your Ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
     - Often, retrying the Ray job submission (the last cell of the notebook) helps.
     - The Ray job may take 15-20 minutes to run the first time due to environment setup.
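The Ray pod check above can be scripted. The sketch below runs against sample `kubectl get pods` output embedded in a here-doc so it works without a cluster; to use it for real, replace the here-doc with the live `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay` call. The pod names shown are hypothetical.

```shell
# Sample output standing in for the live kubectl call; swap it for
#   kubectl get pods -n "${NAMESPACE}" -l app.kubernetes.io/name=kuberay
# when running against a real cluster.
PODS=$(cat <<'EOF'
NAME                  READY   STATUS    RESTARTS   AGE
ray-head-abc12        1/1     Running   0          10m
ray-worker-def34      0/1     Pending   0          10m
EOF
)

# List pods that are not Running -- Pending often means an L4 quota/stockout issue.
NOT_RUNNING=$(echo "$PODS" | awk 'NR>1 && $3!="Running" {print $1}')
echo "$NOT_RUNNING"
```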
-3. Troubleshoot IAP login issues:
+2. Troubleshoot IAP login issues:
     - Verify the cert is `Active`:
         - For JupyterHub: `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
         - For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
     - The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.
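The two certificate checks above can be folded into one loop. In this sketch, `kubectl` is stubbed with a shell function so the loop runs anywhere; delete the stub to query the real ManagedCertificate objects (provisioning can take a while, so a non-`Active` status right after deployment is normal).

```shell
# Stub kubectl so this sketch runs without a cluster; remove this function
# to poll the real managed certificates.
kubectl() { echo "Active"; }

NAMESPACE="${NAMESPACE:-rag}"

# Check both certs from the troubleshooting steps: JupyterHub's cert in
# ${NAMESPACE}, the frontend's cert in the rag namespace.
for cert in "jupyter-managed-cert:${NAMESPACE}" "frontend-managed-cert:rag"; do
  name=${cert%%:*}
  ns=${cert##*:}
  status=$(kubectl get managedcertificates "$name" -n "$ns" \
    --output jsonpath='{.status.domainStatus[0].status}')
  echo "$name: $status"
done
```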
-4. Troubleshoot `terraform apply` failures:
+3. Troubleshoot `terraform apply` failures:
     - Inference server (`mistral`) fails to deploy:
         - This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
     - GCS bucket already exists:
         - GCS bucket names have to be globally unique; pick a different name with a random suffix.
     - Cloud SQL instance already exists:
         - Ensure the `cloudsql_instance` name doesn't already exist in your project.
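The "add a random suffix" advice for bucket names above can be done in one line. The base name below is a hypothetical example; set the result as `gcs_bucket` in `workloads.tfvars`.

```shell
# Build a bucket name with a random 6-hex-character suffix to avoid the
# "GCS bucket already exists" failure; the base name is an assumption.
BASE="rag-docs"
SUFFIX=$(openssl rand -hex 3)
BUCKET="${BASE}-${SUFFIX}"
echo "$BUCKET"
```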
-5. Troubleshoot `terraform destroy` failures:
+4. Troubleshoot `terraform destroy` failures:
     - Network deletion issue:
         - `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.
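The manual network deletion mentioned above might look like the sketch below. The network name is a hypothetical placeholder (check `terraform state` or the Cloud console for the real one), and the commands are only echoed for review because firewall rules and subnets must be cleaned up before the VPC itself can be deleted.

```shell
# Hypothetical network name -- look it up in terraform state or the console.
NETWORK="${NETWORK:-rag-network}"

# Echo the cleanup commands for review; run them by hand once the name is
# confirmed. List dependent firewall rules first, then delete the VPC.
LIST_CMD="gcloud compute firewall-rules list --filter=network:${NETWORK}"
DELETE_CMD="gcloud compute networks delete ${NETWORK} --quiet"
echo "$LIST_CMD"
echo "$DELETE_CMD"
```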
description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.