|
3 | 3 | This repository contains a Terraform template for running [Ray](https://www.ray.io/) on Google Kubernetes Engine.
|
4 | 4 | See the [Ray on GKE](/ray-on-gke/) directory to see additional guides and references.
|
5 | 5 |
|
| 6 | +## Prerequisites |
| 7 | + |
| 8 | +1. A GCP project with the following APIs enabled: |
| 9 | + - container.googleapis.com |
| 10 | + - iap.googleapis.com (required only when using authentication with Identity-Aware Proxy) |
| 11 | + |
| 12 | +2. A functional GKE cluster. |
| 13 | + - To create a new standard or autopilot cluster, follow the instructions in [`infrastructure/README.md`](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/infrastructure/README.md) |
| 14 | + - Alternatively, you can set the `create_cluster` variable to true in `workloads.tfvars` to provision a new GKE cluster. This will default to creating a GKE Autopilot cluster; if you want to provision a standard cluster you must also set `autopilot_cluster` to false. |
| 15 | + |
| 16 | +3. This module can optionally use Identity-Aware Proxy (IAP) to protect access to the Ray dashboard. IAP expects the OAuth brand and consent screen to already be configured in your organization. You can check the details here: [OAuth consent screen](https://console.cloud.google.com/apis/credentials/consent) |
| 17 | + |
| 18 | +4. Preinstall the following on your computer: |
| 19 | + * Terraform |
| 20 | + * gcloud CLI |
| 21 | + |
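The prerequisites above can be checked from a terminal. The following is a minimal sketch, not part of the repository: the helper names `check_tools` and `enable_apis` are hypothetical, and the project ID you pass in is your own. It relies only on `command -v` and the `gcloud services enable` command.

```shell
# Hypothetical helper: report whether each named CLI is on PATH.
# Returns non-zero if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: enable the APIs listed above for the project ID in $1.
# iap.googleapis.com is only needed when protecting the dashboard with IAP.
enable_apis() {
  gcloud services enable container.googleapis.com iap.googleapis.com --project="$1"
}
```

For example, run `check_tools terraform gcloud` first, then `enable_apis <your GCP project>` once both tools are found.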
6 | 22 | ## Installation
|
7 | 23 |
|
8 |
| -Preinstall the following on your computer: |
9 |
| -* Terraform |
10 |
| -* Gcloud |
| 24 | +### Configure Inputs |
11 | 25 |
|
12 |
| -> **_NOTE:_** Terraform keeps state metadata in a local file called `terraform.tfstate`. Deleting the file may cause some resources to not be cleaned up correctly even if you delete the cluster. We suggest using `terraform destroy` before reapplying/reinstalling. |
| 26 | +1. If needed, clone the repo |
| 27 | +``` |
| 28 | +git clone https://github.com/GoogleCloudPlatform/ai-on-gke |
| 29 | +cd ai-on-gke/applications/ray |
| 30 | +``` |
13 | 31 |
|
14 |
| -1. If needed, git clone https://github.com/GoogleCloudPlatform/ai-on-gke |
| 32 | +2. Edit `workloads.tfvars` with your GCP settings. |
| 33 | + |
| 34 | +**Important Note:** |
| 35 | +If you use this with the Jupyter module (`applications/jupyter/`), use the same Kubernetes namespace
| 36 | +for both, i.e. set `kubernetes_namespace` to the namespace used in `applications/jupyter/workloads.tfvars`. |
| 37 | + |
| 38 | +| Variable | Description | Required | |
| 39 | +|-----------------------------|----------------------------------------------------------------------------------------------------------------|:--------:| |
| 40 | +| project_id | GCP project ID | Yes | |
| 41 | +| cluster_name | GKE Cluster Name | Yes | |
| 42 | +| cluster_location | GCP Region | Yes | |
| 43 | +| kubernetes_namespace | The namespace in which Ray and the other resources will be installed. | Yes | |
| 44 | +| gcs_bucket | GCS bucket to be used for Ray storage | Yes | |
| 45 | +| create_service_account | Create service accounts used for Workload Identity mapping | Yes | |
| 46 | + |
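To illustrate the variables in the table, the sketch below writes a sample file named `workloads.tfvars.example` (a hypothetical name, chosen so the real `workloads.tfvars` is not overwritten). Every value is a placeholder, and treating `create_service_account` as a boolean is an assumption; check the variable definitions in the module before copying anything across.

```shell
# Write an example variables file; all values are hypothetical placeholders.
cat > workloads.tfvars.example <<'EOF'
# Replace every value with your own settings.
project_id             = "my-gcp-project"
cluster_name           = "my-gke-cluster"
cluster_location       = "us-central1"
kubernetes_namespace   = "ray"
gcs_bucket             = "my-ray-bucket"
create_service_account = true   # assumed to be a boolean
EOF

# Show the result for review.
cat workloads.tfvars.example
```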
| 47 | + |
| 48 | +### Install |
| 49 | + |
| 50 | +> **_NOTE:_** Terraform keeps state metadata in a local file called `terraform.tfstate`. Deleting this file may prevent some resources from being cleaned up correctly, even if you delete the cluster. We suggest running `terraform destroy` before reapplying or reinstalling. |
15 | 51 |
|
16 |
| -2. `cd applications/ray` |
| 52 | +3. Ensure your gcloud application default credentials are in place. |
| 53 | +``` |
| 54 | +gcloud auth application-default login |
| 55 | +``` |
17 | 56 |
|
18 |
| -3. Find the name and location of the GKE cluster you want to use. |
19 |
| - Run `gcloud container clusters list --project=<your GCP project>` to see all the available clusters. |
20 |
| - _Note: If you created the GKE cluster via the infrastructure repo, you can get the cluster info from `platform.tfvars`_ |
| 57 | +4. Run `terraform init` |
21 | 58 |
|
22 |
| -4. Edit `workloads.tfvars` with your environment specific variables and configurations. |
| 59 | +5. Run `terraform apply --var-file=./workloads.tfvars`. |
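The install steps can also be collected into one script. This is a sketch under stated assumptions rather than a script shipped with the repository: the `run` helper and the `DRY_RUN` and `VAR_FILE` variables are hypothetical names. By default it only prints each command so you can review the sequence before executing anything.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
VAR_FILE="./workloads.tfvars"

# Hypothetical helper: echo the command, and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

run gcloud auth application-default login
run terraform init
run terraform apply --var-file="$VAR_FILE"

# Teardown before reapplying or reinstalling (uncomment when needed):
# run terraform destroy --var-file="$VAR_FILE"
```

Run it from `applications/ray` with `DRY_RUN=0` once the printed commands look right.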
23 | 60 |
|
24 |
| -5. Run `terraform init && terraform apply --var-file workloads.tfvars` |
|