Commit 5b6a225 (parent 29fc124)
Destination AWS Datalake: documentation update to match Airbyte template (#13716)

* Documentation update to match Airbyte template
* Update AWS Datalake doc

1 file changed: 39 additions, 69 deletions
# AWS Datalake

This page contains the setup guide and reference information for the AWS Datalake destination connector.

The AWS Datalake destination connector allows you to sync data to AWS. It will write data as JSON files in S3 and will make it available through a [Lake Formation Governed Table](https://docs.aws.amazon.com/lake-formation/latest/dg/governed-tables.html) in the Glue Data Catalog, so that the data is available throughout other AWS services such as Athena, Glue jobs, EMR, and Redshift.
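The write path described above can be sketched in a few lines: records are serialized as JSON and grouped under an S3 key per stream. The key layout, file naming, and helper names below are illustrative assumptions, not the connector's actual implementation.

```python
import json

def records_to_jsonl(records):
    """Serialize records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)

def s3_key_for(prefix, stream, batch_id):
    """Build a hypothetical S3 object key for a batch of records from one stream."""
    return f"{prefix}/{stream}/batch-{batch_id}.jsonl"

records = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
payload = records_to_jsonl(records)
key = s3_key_for("airbyte-output", "users", 7)
print(key)  # airbyte-output/users/batch-7.jsonl
```

Files written this way can then be registered as tables in the Glue Data Catalog, which is what makes them queryable from Athena, EMR, or Redshift Spectrum.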

## Prerequisites

To use this destination connector, you will need:

* An AWS account
* An S3 bucket where the data will be written
* An AWS Lake Formation database where tables will be created (one per stream)
* AWS credentials, in the form of either an Access Key ID / Secret Access Key pair or a role, with the following permissions:
  * Writing objects to the S3 bucket
  * Updating the Lake Formation database
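Because one table is created per stream, each stream name has to map onto a Glue/Lake Formation table identifier, which is limited to lowercase letters, digits, and underscores. The normalization helper below is an illustrative assumption, not the connector's actual naming logic.

```python
import re

def table_name_for_stream(stream_name):
    """Normalize a stream name into a Glue-safe table identifier (assumed scheme)."""
    name = stream_name.strip().lower()
    # Replace any character Glue would reject with an underscore.
    return re.sub(r"[^a-z0-9_]", "_", name)

print(table_name_for_stream("User Profiles"))  # user_profiles
print(table_name_for_stream("orders-2022"))    # orders_2022
```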

Please check the Setup guide below if you need guidance creating these resources.

## Setup guide

You should now have all the requirements needed to configure AWS Datalake as a destination in the UI. You'll need the following information to configure the destination:
- Aws Account Id: The account ID of your AWS account. You will find instructions for setting up a new AWS account [here](https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/).
- Aws Region: The region in which your resources are deployed.
- Authentication mode: The AWS Datalake connector lets you authenticate with either a user or a role. In both cases, you will have to make sure that appropriate policies are in place. Select "ROLE" if you are using a role, or "USER" if you are using a user with an Access Key / Secret Access Key.
- Target Role Arn: The ARN of the role, if "Authentication mode" is "ROLE". You will find instructions for creating a new role [here](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html).
- Access Key Id: The Access Key ID of the user, if "Authentication mode" is "USER". You will find instructions for creating a new user [here](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html). Make sure to select "Programmatic Access" so that you get secret access keys.
- Secret Access Key: The Secret Access Key of the user, if "Authentication mode" is "USER".
- S3 Bucket Name: The bucket in which the data will be written. You will find instructions for creating a new S3 bucket [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html).
- Target S3 Bucket Prefix: A prefix to prepend to the file name when writing to the bucket.
- Database: The database in which the tables will be created. You will find instructions for creating a new Lake Formation database [here](https://docs.aws.amazon.com/lake-formation/latest/dg/creating-database.html).
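The two authentication modes above boil down to a simple selection: with "ROLE" the connector assumes the given role ARN, with "USER" it signs requests with the access key pair. The config keys and validation logic below are illustrative assumptions, not the connector's actual specification.

```python
def validate_auth_config(config):
    """Check that the fields required by the chosen authentication mode are present."""
    mode = config.get("auth_mode")
    if mode == "ROLE":
        if not config.get("target_role_arn"):
            raise ValueError('"ROLE" mode requires a Target Role Arn')
    elif mode == "USER":
        if not (config.get("access_key_id") and config.get("secret_access_key")):
            raise ValueError('"USER" mode requires an Access Key Id and a Secret Access Key')
    else:
        raise ValueError('Authentication mode must be "ROLE" or "USER"')
    return mode

print(validate_auth_config({
    "auth_mode": "ROLE",
    "target_role_arn": "arn:aws:iam::123456789012:role/example",
}))  # ROLE
```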
**Assigning proper permissions**
The policy used by the user or the role must have access to the following services:

You can use [the AWS policy generator](https://awspolicygen.s3.amazonaws.com/policygen.html) to help you generate an appropriate policy.

Please also make sure that the role or user you will use has appropriate permissions on the database in AWS Lake Formation. You will find more information about Lake Formation permissions in the [AWS Lake Formation Developer Guide](https://docs.aws.amazon.com/lake-formation/latest/dg/lake-formation-permissions.html).
## Supported sync modes

| Feature | Supported? (Yes/No) | Notes |
| :--- | :--- | :--- |
| Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | |
| Namespaces | No | |

## Data type map

The Glue tables will be created with the schema information provided by the source, i.e. you will find the same columns and types in the destination table as in the source, except for the following types, which are translated for compatibility with the Glue Data Catalog:

| Type in the source | Type in the destination |
| :--- | :--- |
| number | float |
| integer | int |
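The type translation in the table above can be expressed as a simple lookup that falls back to the source type when no translation applies; the fallback behavior is an assumption for illustration.

```python
# Source-to-destination type translations from the table above (assumed
# pass-through for any type not listed).
GLUE_TYPE_MAP = {
    "number": "float",
    "integer": "int",
}

def destination_type(source_type):
    """Return the Glue Data Catalog type for a given source type."""
    return GLUE_TYPE_MAP.get(source_type, source_type)

print(destination_type("number"))   # float
print(destination_type("integer"))  # int
print(destination_type("string"))   # string (unchanged)
```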
## Changelog

| Version | Date | Pull Request | Subject |
| :--- | :--- | :--- | :--- |
| 0.1.1 | 2022-04-20 | [#11811](https://github.com/airbytehq/airbyte/pull/11811) | Fix name of required param in specification |
| 0.1.0 | 2022-03-29 | [#10760](https://github.com/airbytehq/airbyte/pull/10760) | Initial release |
