
Commit e80d614

cgardens and tuliren authored
📖 Clarify staging setup guide for bq & gcs destination (#9255)
* clarify confusing parts of setting up staging for bq destination
* Added Storage Admin
* update gcs destination docs too
* fix indentation
* Update required permission list

Co-authored-by: Liren Tu <[email protected]>
1 parent 80695ad commit e80d614

File tree

2 files changed: +26 −12 lines changed


docs/integrations/destinations/bigquery.md

Lines changed: 13 additions & 6 deletions
````diff
@@ -111,15 +111,22 @@ This is the recommended configuration for uploading data to BigQuery. It works b
 * **GCS Bucket Path**
 * **Block Size (MB) for GCS multipart upload**
 * **GCS Bucket Keep files after migration**
-  * See [this](https://cloud.google.com/storage/docs/creating-buckets) for instructions on how to create a GCS bucket.
+  * See [this](https://cloud.google.com/storage/docs/creating-buckets) for instructions on how to create a GCS bucket. The bucket cannot have a retention policy; set Protection Tools to none or Object versioning.
 * **HMAC Key Access ID**
-  * See [this](https://cloud.google.com/storage/docs/authentication/hmackeys) on how to generate an access key.
-  * We recommend creating an Airbyte-specific user or service account. This user or account will require read and write permissions to objects in the bucket.
+  * See [this](https://cloud.google.com/storage/docs/authentication/managing-hmackeys) for instructions on how to generate an access key. For more information on HMAC keys, see the [GCP docs](https://cloud.google.com/storage/docs/authentication/hmackeys).
+  * We recommend creating an Airbyte-specific user or service account. This user or account will require the following permissions for the bucket:
+    ```
+    storage.multipartUploads.abort
+    storage.multipartUploads.create
+    storage.objects.create
+    storage.objects.delete
+    storage.objects.get
+    storage.objects.list
+    ```
+    You can set these in the permissions tab of the GCS bucket: add the email address of the service account or user, then grant it the permissions above.
 * **Secret Access Key**
   * Corresponding key to the above access ID.
-  * Make sure your GCS bucket is accessible from the machine running Airbyte.
-  * This depends on your networking setup.
-  * The easiest way to verify if Airbyte is able to connect to your GCS bucket is via the check connection tool in the UI.
+  * Make sure your GCS bucket is accessible from the machine running Airbyte. This depends on your networking setup. The easiest way to verify that Airbyte can connect to your GCS bucket is via the check connection tool in the UI.
 
 ### `Standard` uploads
 This uploads data directly from your source to BigQuery. While this is faster to set up initially, **we strongly recommend that you do not use this option for anything other than a quick demo**. It is more than 10x slower than the GCS uploading option and will fail for many datasets. Please be aware that you may see failures for big datasets and slow sources, e.g. if reading from the source takes more than 10-12 hours. This is caused by Google BigQuery SDK client limitations. For more details, please check [https://github.com/airbytehq/airbyte/issues/3549](https://github.com/airbytehq/airbyte/issues/3549)
````
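The permission list added above lends itself to a quick sanity check before running a sync. The sketch below is a hypothetical helper, not part of Airbyte or the GCP SDK: it compares whatever permissions a role or service account has been granted against the six permissions the docs list as required, assuming you have already fetched the granted set (for example from the bucket's permissions tab).

```python
# Required bucket permissions for Airbyte's GCS staging,
# taken from the list in the docs above.
REQUIRED_PERMISSIONS = {
    "storage.multipartUploads.abort",
    "storage.multipartUploads.create",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
}


def missing_permissions(granted):
    """Return the required permissions absent from the granted set, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example: a read-only grant is missing the create/delete/abort permissions.
granted = ["storage.objects.get", "storage.objects.list"]
print(missing_permissions(granted))
```

An empty result means the grant covers everything staging needs; anything printed is a permission still to add.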

docs/integrations/destinations/gcs.md

Lines changed: 13 additions & 6 deletions
````diff
@@ -207,16 +207,23 @@ Under the hood, an Airbyte data stream in Json schema is first converted to an A
 
 * Fill up GCS info
 * **GCS Bucket Name**
-  * See [this](https://cloud.google.com/storage/docs/creating-buckets) to create an S3 bucket.
+  * See [this](https://cloud.google.com/storage/docs/creating-buckets) for instructions on how to create a GCS bucket. The bucket cannot have a retention policy; set Protection Tools to none or Object versioning.
 * **GCS Bucket Region**
 * **HMAC Key Access ID**
-  * See [this](https://cloud.google.com/storage/docs/authentication/hmackeys) on how to generate an access key.
-  * We recommend creating an Airbyte-specific user or service account. This user or account will require read and write permissions to objects in the bucket.
+  * See [this](https://cloud.google.com/storage/docs/authentication/managing-hmackeys) for instructions on how to generate an access key. For more information on HMAC keys, see the [GCP docs](https://cloud.google.com/storage/docs/authentication/hmackeys).
+  * We recommend creating an Airbyte-specific user or service account. This user or account will require the following permissions for the bucket:
+    ```
+    storage.multipartUploads.abort
+    storage.multipartUploads.create
+    storage.objects.create
+    storage.objects.delete
+    storage.objects.get
+    storage.objects.list
+    ```
+    You can set these in the permissions tab of the GCS bucket: add the email address of the service account or user, then grant it the permissions above.
 * **Secret Access Key**
   * Corresponding key to the above access ID.
-  * Make sure your GCS bucket is accessible from the machine running Airbyte.
-  * This depends on your networking setup.
-  * The easiest way to verify if Airbyte is able to connect to your GCS bucket is via the check connection tool in the UI.
+  * Make sure your GCS bucket is accessible from the machine running Airbyte. This depends on your networking setup. The easiest way to verify that Airbyte can connect to your GCS bucket is via the check connection tool in the UI.
````
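The HMAC key pair (access ID plus secret) exists so that S3-style tooling can authenticate against GCS's S3-interoperable XML API, which is why the GCS destination asks for it. Purely as an illustration of the mechanism, and under the assumption that the XML API accepts legacy V2-style `AWS access-id:signature` authorization headers, the signing step can be sketched with the standard library; the access ID, secret, and bucket below are placeholders, and in practice gsutil or any S3 client signs requests for you.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone


def sign_xml_request(access_id, secret, method, resource, date=None):
    """Build Date and Authorization headers for an S3-interoperable XML API
    request using a legacy V2-style HMAC-SHA1 signature (illustrative only)."""
    if date is None:
        date = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
    # V2 string-to-sign: method, content-MD5, content-type, date, resource.
    string_to_sign = "\n".join([method, "", "", date, resource])
    digest = hmac.new(secret.encode(), string_to_sign.encode(), hashlib.sha1)
    signature = base64.b64encode(digest.digest()).decode()
    return {"Date": date, "Authorization": f"AWS {access_id}:{signature}"}


# Placeholder credentials and bucket, for illustration only.
headers = sign_xml_request("GOOG1EXAMPLE", "top-secret", "GET", "/my-bucket/")
print(headers["Authorization"])
```

The takeaway for the setup above is simply that the secret never travels with the request, only a signature derived from it, so a wrong secret shows up as an authorization failure in the check connection tool rather than a network error.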
## CHANGELOG
