applications/rag/README.md (+11 -17)
@@ -31,19 +31,17 @@ Install the following on your computer:
### Bring your own cluster (optional)
-By default, this tutorial creates a Standard cluster on your behalf. We highly recommend following the default settings.
+By default, this tutorial creates a cluster on your behalf. We highly recommend following the default settings.
If you prefer to manage your own cluster, set `create_cluster = false` in the [Installation section](#installation). Creating a long-running cluster may be better for development, allowing you to iterate on Terraform components without recreating the cluster every time.
-Use the provided infrastructure module to create a cluster:
+Use gcloud to create a GKE Autopilot cluster. Note that RAG requires the latest Autopilot features, available on the latest versions of 1.28 and 1.29.
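A minimal sketch of that `gcloud` step, assuming hypothetical values for `PROJECT_ID`, `CLUSTER_NAME`, and `REGION` (none of these names come from the tutorial). It wraps `gcloud container clusters create-auto` in a function with a dry-run switch, so the command can be inspected before it runs. Version selection is left out; check the `gcloud` reference for how to request a 1.28 or 1.29 version.

```shell
# Hypothetical defaults; replace them with your own values.
PROJECT_ID="${PROJECT_ID:-my-project}"
CLUSTER_NAME="${CLUSTER_NAME:-rag-cluster}"
REGION="${REGION:-us-central1}"

create_autopilot_cluster() {
  # When DRY_RUN is set and non-empty, the command is echoed instead of executed.
  ${DRY_RUN:+echo} gcloud container clusters create-auto "$CLUSTER_NAME" \
    --project="$PROJECT_ID" \
    --region="$REGION"
}
```

Running `DRY_RUN=1 create_autopilot_cluster` only prints the command. Without `DRY_RUN`, the function creates the cluster, after which `create_cluster = false` tells the tutorial to use it.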
-1. `cd ai-on-gke/infrastructure`
-
-2. Edit `platform.tfvars` to set your project ID, location and cluster name. The other fields are optional. Ensure you create an L4 nodepool as this tutorial requires it.
-- If the JupyterHub job fails to start the proxy with error code 599, it is likely a known issue with Cloud DNS, which occurs when a cluster is quickly deleted and recreated with the same name.
-- Recreate the cluster with a different name or wait several minutes after running `terraform destroy` before running `terraform apply`.
-
-2. Troubleshoot Ray job failures:
+1. Troubleshoot Ray job failures:
- If the Ray actors fail to be scheduled, it could be due to a stockout or quota issue.
- Run `kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/name=kuberay`. There should be a Ray head pod and a Ray worker pod in the `Running` state. If your Ray pods aren't running, it's likely due to quota or stockout issues. Check that your project and selected `cluster_location` have L4 GPU capacity.
- Often, retrying the Ray job submission (the last cell of the notebook) helps.
- The Ray job may take 15-20 minutes to run the first time due to environment setup.
-3. Troubleshoot IAP login issues:
+2. Troubleshoot IAP login issues:
- Verify the cert is Active:
- For JupyterHub: `kubectl get managedcertificates jupyter-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'`
- For the frontend: `kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
- The [OAuth Consent Screen](https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent) has `User type` set to `Internal` by default, which means principals external to the org your project is in cannot log in. To add external principals, change `User type` to `External`.
-4. Troubleshoot `terraform apply` failures:
+3. Troubleshoot `terraform apply` failures:
- Inference server (`mistral`) fails to deploy:
- This usually indicates a stockout/quota issue. Verify your project and chosen `cluster_location` have L4 capacity.
- GCS bucket already exists:
- GCS bucket names must be globally unique; pick a different name with a random suffix.
- Cloud SQL instance already exists:
- Ensure the `cloudsql_instance` name doesn't already exist in your project.
-5. Troubleshoot `terraform destroy` failures:
+4. Troubleshoot `terraform destroy` failures:
- Network deletion issue:
- `terraform destroy` fails to delete the network due to a known issue in the GCP provider. For now, the workaround is to manually delete it.
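
The certificate check in the IAP step above can be scripted rather than re-run by hand. The following is a minimal sketch, assuming hypothetical helper names `cert_status` and `wait_for_cert` and arbitrary retry defaults. Only the `kubectl get managedcertificates` call is taken from the steps above.

```shell
# Print the status of the first domain on a managed certificate.
# Usage: cert_status CERT_NAME NAMESPACE
cert_status() {
  kubectl get managedcertificates "$1" -n "$2" \
    --output jsonpath='{.status.domainStatus[0].status}'
}

# Poll until the certificate reports Active; return 1 if it never does.
# Usage: wait_for_cert CERT_NAME NAMESPACE [MAX_ATTEMPTS] [DELAY_SECONDS]
# The defaults (30 attempts, 60 seconds apart) are arbitrary assumptions.
wait_for_cert() {
  attempts="${3:-30}"
  delay="${4:-60}"
  attempt=1
  while [ "$attempt" -le "$attempts" ]; do
    status="$(cert_status "$1" "$2")"
    if [ "$status" = "Active" ]; then
      echo "$1: Active"
      return 0
    fi
    # Progress goes to stderr so stdout carries only the final result.
    echo "$1: ${status:-unknown} (attempt $attempt of $attempts)" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_cert jupyter-managed-cert "${NAMESPACE}"` covers JupyterHub and `wait_for_cert frontend-managed-cert rag` covers the frontend. The functions use plain POSIX shell, so their variables are global rather than local.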
description: Configure the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent#configure_oauth_consent"><i>OAuth Consent Screen</i></a> for your project. Ensure <b>User type</b> is set to <i>Internal</i>. Note that by default, only users within your organization can be allowlisted. To add external users, change the <b>User type</b> to <i>External</i> after the application is deployed.