Tutorial and documentation for config-based connectors #15027
Merged
99 commits
4855a72
5-step tutorial
girarda 138bd52
move
girarda 637c2a7
tiny bit of editing
girarda 9fabad8
Merge branch 'master' into alex/lowcodeTutorial
girarda ff775e3
Update tutorial
girarda 6ebee74
update docs
girarda ff2b602
reset
girarda 906f915
move files
girarda a64c758
record selector, request options, and more links
girarda 2099b24
update
girarda 8bab845
update
girarda a03e6f3
connector definition
girarda d444d74
link
girarda 71e0a5b
links
girarda a9512ab
Merge branch 'master' into alex/lowcodeTutorial
girarda 7b36ca6
update example
girarda 218bfd9
footnote
girarda 78599e3
typo
girarda c9bfb99
document string interpolation
girarda 58567c5
note on string interpolation
girarda 76a95ae
update
girarda 8feb1a6
Merge branch 'master' into alex/lowcodeTutorial
girarda ecf9b34
fix code sample
girarda 990d44a
fix
girarda f9b1b68
update sample
girarda 945cc3e
fix
girarda a3349df
use the actual config
girarda 318e613
Update as per comments
girarda c54b0c4
Merge branch 'master' into alex/lowcodeTutorial
girarda 9cc1e4b
write as yaml
girarda f096296
typo
girarda 8bd35b4
Clarify options overloading
girarda cfb4528
clarify that docker must be running
girarda 85d5afb
remove extra footnote
girarda 61a75b5
use venv directly
girarda 7e1dc95
Apply suggestions from code review
girarda 3df5071
signup instructions
girarda b074832
update
girarda 672eb16
clarify that both dot and bracket notations are interchangeable
girarda 6575c9d
Clarify how check works
girarda e747b4a
create spec and config before updating connector definition
girarda d5ac31d
clarify what now_local() is
girarda fdce2c6
rename to yaml structure
girarda 198b421
Go through tutorial and update end of section code samples
girarda 18bc40f
fix link
girarda f4e5ed4
update
girarda 83e3845
update code samples
girarda bab017b
Update code samples
girarda 37d1fde
Update to bracket notation
girarda 1fc83fe
remove superfluous comments
girarda 6944916
Update docs/connector-development/config-based/tutorial/2-install-dep…
girarda c317a49
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda 096a370
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda ff804be
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda 49be031
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda 6790422
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda 34214b4
Update docs/connector-development/config-based/tutorial/4-reading-dat…
girarda bf9a205
fix path
girarda 74e4de8
update
girarda ca0f93c
motivation blurp
girarda 46b2ee4
Merge branch 'master' into alex/lowcodeTutorial
girarda 9cfd223
warning
girarda 65a966c
warning
girarda dd4437c
fix code block
girarda 365c0dc
update code samples
girarda ebaa701
update code sample
girarda aacc30a
update code samples
girarda 3b1e85f
small updates
girarda b4498f3
update yaml structure
girarda 306e9e5
custom class example
girarda c2d9b86
language annotations
girarda 562844b
update warning
girarda faada9a
Merge branch 'master' into alex/lowcodeTutorial
girarda 08487f7
Update tutorial to use dpath extractor
girarda 30d25c0
Update record selector docs
girarda 63a295c
unit test
girarda 019cc0a
link to contributing
girarda 117ee2f
tiny update
girarda 3a00dac
$ in front of commands
girarda b2040fc
$ in front of commands
girarda db243a8
More readings
girarda cc0d76c
link to existing config-based connectors
girarda 6cbdaa0
index
girarda 619bf37
update
girarda 9a4f1c9
delete broken link
girarda 5337868
supported features
girarda e4919d5
update
girarda e27fade
Add some links
girarda 048bddb
Update docs/connector-development/config-based/overview.md
girarda 019aad2
Update docs/connector-development/config-based/record-selector.md
girarda cc308f2
Update docs/connector-development/config-based/overview.md
girarda e3cffc8
Update docs/connector-development/config-based/overview.md
girarda 785a3e4
Update docs/connector-development/config-based/overview.md
girarda 2db7694
mention the unit
girarda 2a7d5fc
headers
girarda eba0322
remove mentions of interpolating on stream slice, etc.
girarda e7f0023
Merge branch 'master' into alex/lowcodeTutorial
girarda 8353121
update
girarda e6637d3
exclude config-based docs
girarda
42 changes: 42 additions & 0 deletions
42
...onnector-development/tutorials/cdk-api-source/config-based/0-getting-started.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# Getting Started | ||
|
||
## Summary | ||
|
||
Throughout this tutorial, we'll walk you through the creation of an Airbyte source that reads data from an HTTP API. | ||
|
||
We'll build a connector reading data from the Exchange Rates API, but the steps we'll go through will apply to other HTTP APIs you might be interested in integrating with. | ||
|
||
The API documentation can be found [here](https://exchangeratesapi.io/documentation/). | ||
In this tutorial, we will read data from the following endpoints: | ||
|
||
- `Latest Rates Endpoint` | ||
- `Historical Rates Endpoint` | ||
|
||
The end goal is to implement a Source with a single `Stream` containing exchange rates going from a base currency to many other currencies. | ||
The output schema of our stream will look like this: | ||
|
||
```json | ||
{ | ||
"base": "USD", | ||
"date": "2022-07-15", | ||
"rates": { | ||
"CAD": 1.28, | ||
"EUR": 0.98 | ||
} | ||
} | ||
``` | ||
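As a rough illustration of how a consumer might use a record of this shape, the sketch below models it in Python and converts an amount from the base currency. The `RatesRecord` type and the `convert` helper are hypothetical names introduced here for illustration; they are not part of the connector or of the Airbyte CDK.

```python
from typing import Dict, TypedDict


class RatesRecord(TypedDict):
    """Hypothetical type mirroring the sample record above."""

    base: str
    date: str
    rates: Dict[str, float]


def convert(record: RatesRecord, amount: float, target: str) -> float:
    """Convert `amount` of the base currency into `target` using the record's rates."""
    if target not in record["rates"]:
        raise KeyError(f"no rate for {target} in record dated {record['date']}")
    return amount * record["rates"][target]


# Same values as the sample record shown above.
sample: RatesRecord = {
    "base": "USD",
    "date": "2022-07-15",
    "rates": {"CAD": 1.28, "EUR": 0.98},
}
```

With the sample record, `convert(sample, 100, "CAD")` yields roughly 128, since one unit of the base currency is worth 1.28 CAD.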
|
||
## Exchange Rates API Setup | ||
|
||
Before we can get started, you'll need to generate an API access key for the Exchange Rates API. | ||
This can be done by signing up for the Free tier plan on [Exchange Rates API](https://exchangeratesapi.io/). | ||
|
||
## Requirements | ||
|
||
- Python >= 3.9 | ||
- Docker | ||
- NodeJS | ||
|
||
## Next Steps | ||
|
||
Next, we'll [create a Source using the connector generator.](./1-create-source.md) |
27 changes: 27 additions & 0 deletions
27
.../connector-development/tutorials/cdk-api-source/config-based/1-create-source.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Step 1: Create the Source | ||
|
||
Let's start by cloning the Airbyte repository | ||
|
||
``` | ||
git clone git@github.com:airbytehq/airbyte.git | ||
``` | ||
|
||
Airbyte provides a code generator which bootstraps the scaffolding for our connector. | ||
|
||
``` | ||
cd airbyte-integrations/connector-templates/generator | ||
./generate.sh | ||
``` | ||
|
||
This will bring up an interactive helper application. Use the arrow keys to pick a template from the list. Select the `Configuration Based Source` template and then input the name of your connector. The application will create a new directory in `airbyte/airbyte-integrations/connectors/` with the name of your new connector. | ||
|
||
``` | ||
Configuration Based Source | ||
Source name: exchange-rates-tutorial | ||
``` | ||
|
||
For this walkthrough, we'll refer to our source as `exchange-rates-tutorial`. The complete source code for this tutorial can be found here <FIXME: there should be a link to the complete tutorial? | ||
|
||
## Next steps | ||
|
||
Next, [we'll install dependencies required to run the connector](./2-install-dependencies.md) |
45 changes: 45 additions & 0 deletions
45
...tor-development/tutorials/cdk-api-source/config-based/2-install-dependencies.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
# Step 2: Install dependencies | ||
|
||
Let's create a python virtual environment for our source. | ||
You can do this by executing the following commands from the root of the Airbyte repository. | ||
|
||
The commands below assume that `python` points to a version of Python >= 3.9.0. On some systems, `python` points to a Python 2 installation and `python3` points to Python 3. | ||
If this is the case on your machine, substitute the `python` commands with `python3`. | ||
The subsequent `python` invocations will use the virtual environment created for the connector. | ||
|
||
``` | ||
python tools/bin/update_intellij_venv.py -modules source-exchange-rates-tutorial --install-venv | ||
cd airbyte-integrations/connectors/source-exchange-rates-tutorial | ||
source .venv/bin/activate | ||
``` | ||
|
||
These steps create an initial python environment, and install the dependencies required to run an API Source connector. | ||
|
||
Let's verify everything works as expected by running the Airbyte `spec` operation: | ||
|
||
``` | ||
python main.py spec | ||
|
||
``` | ||
|
||
You should see an output similar to the one below: | ||
|
||
``` | ||
{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", "connectionSpecification": {"$schema": "http://json-schema.org/draft-07/schema#", "title": "Python Http Tutorial Spec", "type": "object", "required": ["TODO"], "additionalProperties": false, "properties": {"TODO: This schema defines the configuration required for the source. This usually involves metadata such as database and/or authentication information.": {"type": "string", "description": "describe me"}}}}} | ||
``` | ||
|
||
More details on the `spec` operation can be found in [Basic Concepts](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts) and [Defining Stream Schemas](https://docs.airbyte.com/connector-development/cdk-python/schemas), but this is a simple sanity check to make sure everything is wired up correctly. | ||
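If you want to inspect the `spec` output programmatically rather than by eye, one possible approach is to parse the printed line as JSON. The sketch below is only an illustration: `required_fields` is a hypothetical helper, and `sample_line` is a shortened copy of the sample output above, not the exact output of your connector.

```python
import json
from typing import Any, Dict, List


def required_fields(spec_line: str) -> List[str]:
    """Return the `required` list from one line of `spec` output."""
    message: Dict[str, Any] = json.loads(spec_line)
    if message.get("type") != "SPEC":
        raise ValueError(f"expected a SPEC message, got {message.get('type')!r}")
    return message["spec"]["connectionSpecification"].get("required", [])


# Shortened version of the sample output shown above.
sample_line = (
    '{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", '
    '"connectionSpecification": {"type": "object", "required": ["TODO"], '
    '"properties": {}}}}'
)
print(required_fields(sample_line))  # prints ['TODO']
```

At this point the generated spec still lists the placeholder `TODO` as its only required field.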
|
||
For now, note that the `main.py` file is a convenience wrapper to help run the connector. | ||
Its invocation format is `python main.py <command> [args]`. | ||
The module's generated `README.md` contains more details on the supported commands. | ||
|
||
## Next steps | ||
|
||
Next, we'll [connect to the API source](./3-connecting.md) | ||
|
||
## More readings | ||
|
||
- [Basic Concepts](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts) | ||
- [Defining Stream Schemas](https://docs.airbyte.com/connector-development/cdk-python/schemas) | ||
- The module's generated `README.md` contains more details on the supported commands. |
180 changes: 180 additions & 0 deletions
180
docs/connector-development/tutorials/cdk-api-source/config-based/3-connecting.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
# Step 3: Connecting to the API | ||
|
||
We're now ready to start implementing the connector. | ||
|
||
The code generator already created a boilerplate connector definition in `source-exchange-rates-tutorial/source_exchange_rates_tutorial/exchange_rates_tutorial.yaml` | ||
|
||
``` | ||
schema_loader: | ||
type: JsonSchema | ||
file_path: "./source_exchange_rates_tutorial/schemas/{{ name }}.json" | ||
selector: | ||
type: RecordSelector | ||
extractor: | ||
type: JelloExtractor | ||
transform: "_" | ||
requester: | ||
type: HttpRequester | ||
name: "{{ options['name'] }}" | ||
url_base: TODO "your_api_base_url" | ||
http_method: "GET" | ||
authenticator: | ||
type: TokenAuthenticator | ||
token: "{{ config['api_key'] }}" | ||
retriever: | ||
type: SimpleRetriever | ||
name: "{{ options['name'] }}" | ||
primary_key: "{{ options['primary_key'] }}" | ||
record_selector: | ||
ref: "*ref(selector)" | ||
paginator: | ||
type: NoPagination | ||
state: | ||
class_name: airbyte_cdk.sources.declarative.states.dict_state.DictState | ||
customers_stream: | ||
type: DeclarativeStream | ||
options: | ||
name: "customers" | ||
primary_key: "id" | ||
schema_loader: | ||
ref: "*ref(schema_loader)" | ||
retriever: | ||
ref: "*ref(retriever)" | ||
requester: | ||
ref: "*ref(requester)" | ||
path: TODO "your_endpoint_path" | ||
streams: | ||
- "*ref(customers_stream)" | ||
check: | ||
type: CheckStream | ||
stream_names: ["customers_stream"] | ||
``` | ||
|
||
Let's fill out these TODOs with the information found in the [Exchange Rates API docs](https://exchangeratesapi.io/documentation/). | ||
|
||
1. First, let's rename the stream from `customers` to `rates`. | ||
|
||
``` | ||
rates_stream: | ||
type: DeclarativeStream | ||
options: | ||
name: "rates" | ||
``` | ||
|
||
and update the references in the streams list and check block: | ||
|
||
``` | ||
streams: | ||
- "*ref(rates_stream)" | ||
check: | ||
type: CheckStream | ||
stream_names: ["rates_stream"] | ||
``` | ||
|
||
2. Next we'll set the base url. | ||
According to the API documentation, the base url is "https://api.exchangeratesapi.io/v1/". | ||
This can be set in the requester definition. | ||
|
||
``` | ||
requester: | ||
type: HttpRequester | ||
name: "{{ options['name'] }}" | ||
url_base: "https://api.exchangeratesapi.io/v1/" | ||
``` | ||
|
||
3. We can fetch the latest data by submitting a request to "/latest". This path is specific to the stream, so we'll set it within the `rates_stream` definition. | ||
|
||
``` | ||
rates_stream: | ||
type: DeclarativeStream | ||
options: | ||
name: "rates" | ||
primary_key: "id" | ||
schema_loader: | ||
ref: "*ref(schema_loader)" | ||
retriever: | ||
ref: "*ref(retriever)" | ||
requester: | ||
ref: "*ref(requester)" | ||
path: "/latest" | ||
``` | ||
|
||
4. Next, we'll set up the authentication. | ||
The Exchange Rates API requires an access key, which we'll need to make accessible to our connector. | ||
We'll configure the connector to use this access key by setting the access key in a request parameter and pointing to a field in the config, which we'll populate in the next step: | ||
|
||
``` | ||
requester: | ||
type: HttpRequester | ||
name: "{{ options['name'] }}" | ||
url_base: "https://api.exchangeratesapi.io/v1/" | ||
http_method: "GET" | ||
request_options_provider: | ||
request_parameters: | ||
access_key: "{{ config.access_key }}" | ||
``` | ||
|
||
5. According to the ExchangeRatesApi documentation, we can specify the base currency of interest in a request parameter: | ||
|
||
``` | ||
request_options_provider: | ||
request_parameters: | ||
access_key: "{{ config.access_key }}" | ||
base: "{{ config.base }}" | ||
``` | ||
|
||
6. Let's populate the config so the connector can access the access key and base currency. | ||
First, we'll add these properties to the connector spec in | ||
`source-exchange-rates-tutorial/source_exchange_rates_tutorial/spec.yaml` | ||
|
||
``` | ||
documentationUrl: https://docs.airbyte.io/integrations/sources/exchangeratesapi | ||
connectionSpecification: | ||
$schema: http://json-schema.org/draft-07/schema# | ||
title: exchangeratesapi.io Source Spec | ||
type: object | ||
required: | ||
- access_key | ||
- base | ||
additionalProperties: false | ||
properties: | ||
access_key: | ||
type: string | ||
description: >- | ||
Your API Access Key. See <a | ||
href="https://exchangeratesapi.io/documentation/">here</a>. The key is | ||
case sensitive. | ||
airbyte_secret: true | ||
base: | ||
type: string | ||
description: >- | ||
ISO reference currency. See <a | ||
href="https://www.ecb.europa.eu/stats/policy_and_exchange_rates/euro_reference_exchange_rates/html/index.en.html">here</a>. | ||
examples: | ||
- EUR | ||
- USD | ||
``` | ||
|
||
7. We also need to fill in the connection config in `secrets/config.json`. | ||
Because the access key is sensitive, we recommend storing this config in the `secrets` directory, which is ignored by git. | ||
|
||
``` | ||
echo '{"access_key": "<your_access_key>", "base": "USD"}' > secrets/config.json | ||
``` | ||
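To make the effect of steps 2 through 7 more concrete, here is a minimal Python sketch of the request URL these settings describe. It is not how the connector is implemented: `parse_config` and `latest_rates_url` are hypothetical helpers, and joining the base URL and path with exactly one slash is an assumption made for this illustration.

```python
import json
from typing import Dict
from urllib.parse import urlencode

URL_BASE = "https://api.exchangeratesapi.io/v1/"  # url_base from step 2
PATH = "/latest"  # path from step 3
REQUIRED = ("access_key", "base")  # `required` list in spec.yaml (step 6)


def parse_config(raw: str) -> Dict[str, str]:
    """Parse the JSON config and check that the required fields are present."""
    config = json.loads(raw)
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError(f"config is missing required fields: {missing}")
    return config


def latest_rates_url(config: Dict[str, str]) -> str:
    """Build the URL for the latest-rates request from the config values."""
    # Assumption: the base URL and the path are joined with exactly one slash.
    url = URL_BASE.rstrip("/") + "/" + PATH.lstrip("/")
    # Request parameters from steps 4 and 5.
    params = {"access_key": config["access_key"], "base": config["base"]}
    return f"{url}?{urlencode(params)}"


config = parse_config('{"access_key": "<your_access_key>", "base": "USD"}')
print(latest_rates_url(config))
```

Both values come from the config, so changing `base` in `secrets/config.json` changes the base currency sent in the request without touching the connector definition.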
|
||
We can now run the `check` operation, which verifies the connector can connect to the API source. | ||
|
||
``` | ||
python main.py check --config secrets/config.json | ||
``` | ||
|
||
which should now succeed with logs similar to: | ||
|
||
``` | ||
{"type": "LOG", "log": {"level": "INFO", "message": "Check succeeded"}} | ||
{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}} | ||
``` | ||
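The output is one JSON message per line, so a script can pick out the connection status instead of scanning the logs by eye. The sketch below assumes the output looks like the sample above; `connection_status` is a hypothetical helper, not part of the Airbyte CDK.

```python
import json
from typing import Iterable, Optional


def connection_status(lines: Iterable[str]) -> Optional[str]:
    """Return the status of the first CONNECTION_STATUS message, if any."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        if message.get("type") == "CONNECTION_STATUS":
            return message["connectionStatus"]["status"]
    return None


# The two sample lines shown above.
sample_output = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "Check succeeded"}}',
    '{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}}',
]
print(connection_status(sample_output))  # prints SUCCEEDED
```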
|
||
## Next steps | ||
|
||
Next, we'll [extract the records from the response](4-reading-data.md) |
the instructions depend on merging #14552 and #14923