
update Snowflake destination docs with more info #10213


Merged · 4 commits into master · Feb 16, 2022

Conversation

subodh1810 (Contributor)

Andy suggested these improvements to our Snowflake documentation

@subodh1810 subodh1810 self-assigned this Feb 9, 2022
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Feb 9, 2022
@subodh1810 subodh1810 temporarily deployed to more-secrets February 9, 2022 10:40 Inactive
@subodh1810 subodh1810 temporarily deployed to more-secrets February 9, 2022 10:43 Inactive
### Requirements

1. Active Snowflake warehouse
2. A staging S3 or GCS bucket with credentials (for the Cloud Storage Staging strategy).
Contributor:

I think we should remove this since everyone should use internal staging

Contributor:

To clarify, if a user doesn't have S3 or GCS staging set up, they should be able to move forward with just an active Snowflake warehouse, without changing the loading method. Is that correct? If so, I agree with Sherif's comment here.

Contributor (Author):

Yes, that's correct @misteryeo

misteryeo (Contributor) left a comment:

Looks great! Added a few comments - happy to continue discussing too. Also looped in Amruta as an FYI so she can see how things are changing with docs.


By default, Airbyte uses batches of `INSERT` commands to add data to a temporary table before copying it over to the final table in Snowflake. This is too slow for larger (multi-GB) replications. For those larger replications, we recommend configuring cloud storage to allow batch writes and loading.
By default, Airbyte uses `INTERNAL STAGING`
Contributor:

Can we hyperlink this to the correct section in our docs in case folks want to understand more about this?

* **Username**
* **Password**
* **JDBC URL Params** (Optional)
* **Host**: The host domain of the Snowflake instance (must include the account, region, and cloud environment, and end with `snowflakecomputing.com`). Example: `accountname.us-east-2.aws.snowflakecomputing.com`
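For illustration, a hypothetical set of values for these fields might look like the following (all values are placeholders; `CLIENT_SESSION_KEEP_ALIVE` is one example of a Snowflake JDBC URL parameter, not something the connector requires):

```text
Host:            accountname.us-east-2.aws.snowflakecomputing.com
Username:        AIRBYTE_USER
Password:        ********
JDBC URL Params: CLIENT_SESSION_KEEP_ALIVE=true
```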
Contributor:

For this section of descriptions, are there external docs from Snowflake that can/should be referenced?

For example, with "Role", this seems like it's appropriate: https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#roles.


### Internal Staging

Internal named stages are storage location objects within a Snowflake database/schema. Because they are database objects, the same security permissions apply as with any other database objects. No additional properties need to be provided for internal staging. This is also the recommended way of using the connector.
Contributor:

Is there a reason why we recommend this and why this is default?


We recommend creating an Airbyte-specific warehouse, database, schema, user, and role for writing data into Snowflake so it is possible to track costs specifically related to Airbyte (including the cost of running this warehouse) and control permissions at a granular level. Since the Airbyte user creates, drops, and alters tables, `OWNERSHIP` permissions are required in Snowflake. If you are not following the recommended script below, please limit the `OWNERSHIP` permissions to only the necessary database and schema for the Airbyte user.

We provide the following script to create these resources. Before running it, you must change the password to something secure. You may also rename the other resources if you wish.
Log in to your Snowflake warehouse, copy and paste the script into a new [worksheet](https://docs.snowflake.com/en/user-guide/ui-worksheet.html), select the `All Queries` checkbox, and then press the `Run` button.
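As a rough sketch of the kind of resources such a script creates (the names and password below are placeholders, not the exact script from the docs), it might look something like:

```sql
-- Hypothetical sketch of an Airbyte-specific setup. All names and the
-- password are placeholders; change the password before running.
CREATE WAREHOUSE IF NOT EXISTS AIRBYTE_WAREHOUSE
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60;  -- suspend quickly to keep warehouse costs down
CREATE DATABASE IF NOT EXISTS AIRBYTE_DATABASE;
CREATE ROLE IF NOT EXISTS AIRBYTE_ROLE;
CREATE USER IF NOT EXISTS AIRBYTE_USER
  PASSWORD = 'change-me-to-something-secure'
  DEFAULT_ROLE = AIRBYTE_ROLE
  DEFAULT_WAREHOUSE = AIRBYTE_WAREHOUSE;
-- Scope OWNERSHIP to just the Airbyte database, per the note above.
GRANT OWNERSHIP ON DATABASE AIRBYTE_DATABASE TO ROLE AIRBYTE_ROLE;
GRANT USAGE ON WAREHOUSE AIRBYTE_WAREHOUSE TO ROLE AIRBYTE_ROLE;
GRANT ROLE AIRBYTE_ROLE TO USER AIRBYTE_USER;
CREATE SCHEMA IF NOT EXISTS AIRBYTE_DATABASE.AIRBYTE_SCHEMA;
```

Scoping `OWNERSHIP` to a dedicated database, as sketched here, keeps the Airbyte user from being able to drop or alter tables elsewhere in the account.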
Contributor:

Yes! Love that we're hyperlinking out to Snowflake docs here!

@subodh1810 subodh1810 temporarily deployed to more-secrets February 14, 2022 17:07 Inactive
@subodh1810 subodh1810 requested a review from misteryeo February 14, 2022 17:08
misteryeo (Contributor) left a comment:

LGTM!

@subodh1810 subodh1810 merged commit 531ed1a into master Feb 16, 2022
@subodh1810 subodh1810 deleted the update-snowflake-destination-docs branch February 16, 2022 11:00