applications/rag/README.md
+15-20
@@ -31,19 +31,17 @@ Install the following on your computer:
### Bring your own cluster (optional)
-
By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+
By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.
If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.
-
Use the provided infrastructure module to create a cluster:
+
Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.
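As a rough illustration of the gcloud step above, the sketch below only prints a `gcloud container clusters create-auto` command instead of running it. The helper name `autopilot_create_cmd` and the example cluster name, region, and project are hypothetical placeholders, and the sketch does not pin a GKE version, so check that the resulting cluster lands on a recent 1.28 or 1.29 release.

```shell
# Hypothetical helper (not part of the repository): print, but do not run,
# a command that would create a GKE Autopilot cluster.
# Arguments: cluster name, region, project ID.
autopilot_create_cmd() {
  printf 'gcloud container clusters create-auto %s --region=%s --project=%s\n' \
    "$1" "$2" "$3"
}

# Placeholder values; replace them with your own before running the output.
autopilot_create_cmd "rag-cluster" "us-central1" "my-project"
```

Review the printed command, then paste it into a shell where gcloud is authenticated.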
-
1. `cd ai-on-gke/infrastructure`
-
-
2. Edit `platform.tfvars` to set your project ID, location and cluster name. The other fields are optional. Ensure you create an L4 nodepool as this tutorial requires it.
@@ -64,10 +62,11 @@ This section sets up the RAG infrastructure in your GCP project using Terraform.
1. `cd ai-on-gke/applications/rag`
2. Edit `workloads.tfvars` to set your project ID, location, cluster name, and GCS bucket name. Ensure the `gcs_bucket` name is globally unique (add a random suffix). Optionally, make the following changes:
-
* (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
* (Recommended) [Enable authenticated access](#configure-authenticated-access-via-iap) for JupyterHub, frontend chat and Ray dashboard services.
-
* (Not recommended) Set `create_cluster = false` if you bring your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled.
-
* (Not recommended) Set `create_network = false` if you bring your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
+
* (Optional) Set a custom `kubernetes_namespace` where all k8s resources will be created.
+
* (Optional) Set `autopilot_cluster = false` to deploy using GKE Standard.
+
* (Optional) Set `create_cluster = false` if you are bringing your own cluster. If using a GKE Standard cluster, ensure it has an L4 nodepool with autoscaling and node autoprovisioning enabled. You can simplify setup by following the Terraform instructions in [`infrastructure/README.md`](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/infrastructure/README.md).
+
* (Optional) Set `create_network = false` if you are bringing your own VPC. Ensure your VPC has Private Service Connect enabled as described above.
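One possible way to produce the random suffix recommended for `gcs_bucket` in step 2 is sketched below. The helper name `unique_bucket_name` and the `rag-data` prefix are hypothetical and not part of the repository; the sketch only generates a name, it does not create the bucket or edit `workloads.tfvars`.

```shell
# Hypothetical helper: build a bucket name from a prefix plus a random suffix.
unique_bucket_name() {
  prefix="$1"
  # Read 4 random bytes and render them as 8 lowercase hex characters.
  suffix="$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s-%s\n' "$prefix" "$suffix"
}

# Prints the prefix followed by a dash and 8 random hex characters.
unique_bucket_name "rag-data"
```

Copy the printed value into the `gcs_bucket` field of `workloads.tfvars`.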
- If the JupyterHub job fails to start the proxy with error code 599, it is likely a known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-
- Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-
2. Troubleshoot Ray job failures:
+
1. Troubleshoot Ray job failures:
- If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
- Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head pod and a Ray worker pod in the `Running` state. If your Ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
- Often, retrying the Ray job submission (the last cell of the notebook) helps.
- The Ray job may take 15-20 minutes to run the first time due to environment setup.
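The Ray pod check above can be narrowed to just the pods that are not yet running. The sketch below is a minimal filter, assuming the default column layout of `kubectl get pods --no-headers` (name, ready, status, restarts, age); the function name `list_not_running` is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin and
# print the name and status of every pod whose STATUS column (third field)
# is not "Running".
list_not_running() {
  awk '$3 != "Running" { print $1, $3 }'
}

# Usage against a live cluster (NAMESPACE must be set):
#   kubectl get pods -n "${NAMESPACE}" -l app.kubernetes.io/name=kuberay --no-headers | list_not_running
```

Empty output means every matching pod reports `Running`; any pod listed as `Pending` is a candidate for the quota or stockout checks described above.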
-
3. Troubleshoot IAP login issues:
+
2. Troubleshoot IAP login issues:
- Verify the cert is Active:
- For JupyterHub: `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
- For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
- The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.
-
4. Troubleshoot `terraform apply` failures:
+
3. Troubleshoot `terraform apply` failures:
- Inference server (`mistral`) fails to deploy:
- This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
- GCS bucket already exists:
- GCS bucket names must be globally unique. Pick a different name with a random suffix.
- Cloud SQL instance already exists:
- Ensure the `cloudsql_instance` name doesn't already exist in your project.
-
5. Troubleshoot `terraform destroy` failures:
+
4. Troubleshoot `terraform destroy` failures:
- Network deletion issue:
- `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.
description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.