Tutorial and documentation for config-based connectors #15027

Merged: 99 commits, Aug 12, 2022
Commits
4855a72
5-step tutorial
girarda Jul 25, 2022
138bd52
move
girarda Jul 26, 2022
637c2a7
tiny bit of editing
girarda Jul 26, 2022
9fabad8
Merge branch 'master' into alex/lowcodeTutorial
girarda Jul 28, 2022
ff775e3
Update tutorial
girarda Jul 28, 2022
6ebee74
update docs
girarda Aug 1, 2022
ff2b602
reset
girarda Aug 1, 2022
906f915
move files
girarda Aug 1, 2022
a64c758
record selector, request options, and more links
girarda Aug 1, 2022
2099b24
update
girarda Aug 1, 2022
8bab845
update
girarda Aug 1, 2022
a03e6f3
connector definition
girarda Aug 1, 2022
d444d74
link
girarda Aug 1, 2022
71e0a5b
links
girarda Aug 2, 2022
a9512ab
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
7b36ca6
update example
girarda Aug 2, 2022
218bfd9
footnote
girarda Aug 2, 2022
78599e3
typo
girarda Aug 2, 2022
c9bfb99
document string interpolation
girarda Aug 2, 2022
58567c5
note on string interpolation
girarda Aug 2, 2022
76a95ae
update
girarda Aug 2, 2022
8feb1a6
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
ecf9b34
fix code sample
girarda Aug 2, 2022
990d44a
fix
girarda Aug 2, 2022
f9b1b68
update sample
girarda Aug 2, 2022
945cc3e
fix
girarda Aug 2, 2022
a3349df
use the actual config
girarda Aug 2, 2022
318e613
Update as per comments
girarda Aug 7, 2022
c54b0c4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 8, 2022
9cc1e4b
write as yaml
girarda Aug 8, 2022
f096296
typo
girarda Aug 8, 2022
8bd35b4
Clarify options overloading
girarda Aug 8, 2022
cfb4528
clarify that docker must be running
girarda Aug 8, 2022
85d5afb
remove extra footnote
girarda Aug 8, 2022
61a75b5
use venv directly
girarda Aug 8, 2022
7e1dc95
Apply suggestions from code review
girarda Aug 8, 2022
3df5071
signup instructions
girarda Aug 8, 2022
b074832
update
girarda Aug 8, 2022
672eb16
clarify that both dot and bracket notations are interchangeable
girarda Aug 8, 2022
6575c9d
Clarify how check works
girarda Aug 8, 2022
e747b4a
create spec and config before updating connector definition
girarda Aug 8, 2022
d5ac31d
clarify what now_local() is
girarda Aug 8, 2022
fdce2c6
rename to yaml structure
girarda Aug 8, 2022
198b421
Go through tutorial and update end of section code samples
girarda Aug 9, 2022
18bc40f
fix link
girarda Aug 9, 2022
f4e5ed4
update
girarda Aug 9, 2022
83e3845
update code samples
girarda Aug 9, 2022
bab017b
Update code samples
girarda Aug 9, 2022
37d1fde
Update to bracket notation
girarda Aug 9, 2022
1fc83fe
remove superfluous comments
girarda Aug 9, 2022
6944916
Update docs/connector-development/config-based/tutorial/2-install-dep…
girarda Aug 9, 2022
c317a49
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
096a370
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
ff804be
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
49be031
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
6790422
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
34214b4
Update docs/connector-development/config-based/tutorial/4-reading-dat…
girarda Aug 9, 2022
bf9a205
fix path
girarda Aug 9, 2022
74e4de8
update
girarda Aug 9, 2022
ca0f93c
motivation blurp
girarda Aug 9, 2022
46b2ee4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 10, 2022
9cfd223
warning
girarda Aug 10, 2022
65a966c
warning
girarda Aug 10, 2022
dd4437c
fix code block
girarda Aug 10, 2022
365c0dc
update code samples
girarda Aug 10, 2022
ebaa701
update code sample
girarda Aug 10, 2022
aacc30a
update code samples
girarda Aug 10, 2022
3b1e85f
small updates
girarda Aug 10, 2022
b4498f3
update yaml structure
girarda Aug 10, 2022
306e9e5
custom class example
girarda Aug 10, 2022
c2d9b86
language annotations
girarda Aug 10, 2022
562844b
update warning
girarda Aug 11, 2022
faada9a
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 11, 2022
08487f7
Update tutorial to use dpath extractor
girarda Aug 11, 2022
30d25c0
Update record selector docs
girarda Aug 11, 2022
63a295c
unit test
girarda Aug 11, 2022
019cc0a
link to contributing
girarda Aug 12, 2022
117ee2f
tiny update
girarda Aug 12, 2022
3a00dac
$ in front of commands
girarda Aug 12, 2022
b2040fc
$ in front of commands
girarda Aug 12, 2022
db243a8
More readings
girarda Aug 12, 2022
cc0d76c
link to existing config-based connectors
girarda Aug 12, 2022
6cbdaa0
index
girarda Aug 12, 2022
619bf37
update
girarda Aug 12, 2022
9a4f1c9
delete broken link
girarda Aug 12, 2022
5337868
supported features
girarda Aug 12, 2022
e4919d5
update
girarda Aug 12, 2022
e27fade
Add some links
girarda Aug 12, 2022
048bddb
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
019aad2
Update docs/connector-development/config-based/record-selector.md
girarda Aug 12, 2022
cc308f2
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
e3cffc8
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
785a3e4
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
2db7694
mention the unit
girarda Aug 12, 2022
2a7d5fc
headers
girarda Aug 12, 2022
eba0322
remove mentions of interpolating on stream slice, etc.
girarda Aug 12, 2022
e7f0023
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 12, 2022
8353121
update
girarda Aug 12, 2022
e6637d3
exclude config-based docs
girarda Aug 12, 2022
@@ -0,0 +1,42 @@
# Getting Started

## Summary

Throughout this tutorial, we'll walk you through the creation of an Airbyte source to read data from an HTTP API.

We'll build a connector reading data from the Exchange Rates API, but the steps we'll go through will apply to other HTTP APIs you might be interested in integrating with.

The API documentation can be found [here](https://exchangeratesapi.io/documentation/).
In this tutorial, we will read data from the following endpoints:

- `Latest Rates Endpoint`
- `Historical Rates Endpoint`

The end goal is to implement a Source with a single `Stream` containing exchange rates from a base currency to many other currencies.
The output of our stream will look like:

```json
{
  "base": "USD",
  "date": "2022-07-15",
  "rates": {
    "CAD": 1.28,
    "EUR": 0.98
  }
}
```
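
To make the record shape concrete, here is a minimal Python sketch that parses one such record. The `ExchangeRateRecord` class and the `parse_record` function are hypothetical names used for illustration only; they are not part of the Airbyte CDK or of the connector built in this tutorial.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass
class ExchangeRateRecord:
    """Hypothetical container mirroring the record shape shown above."""

    base: str
    date: str
    rates: Dict[str, float]


def parse_record(raw: str) -> ExchangeRateRecord:
    """Parse a JSON string into an ExchangeRateRecord."""
    data = json.loads(raw)
    return ExchangeRateRecord(
        base=data["base"],
        date=data["date"],
        rates={currency: float(rate) for currency, rate in data["rates"].items()},
    )


sample = '{"base": "USD", "date": "2022-07-15", "rates": {"CAD": 1.28, "EUR": 0.98}}'
record = parse_record(sample)
print(record.base, record.date, sorted(record.rates))
```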

## Exchange Rates API Setup

Before we can get started, you'll need to generate an API access key for the Exchange Rates API.
This can be done by signing up for the Free tier plan on [Exchange Rates API](https://exchangeratesapi.io/).

## Requirements

- Python >= 3.9
- Docker
- NodeJS

## Next Steps

Next, we'll [create a Source using the connector generator.](./1-create-source.md)
@@ -0,0 +1,27 @@
# Step 1: Create the Source

Let's start by cloning the Airbyte repository:

```
git clone git@github.com:airbytehq/airbyte.git
```

Review comment (Contributor, Author): the instructions depend on merging #14552 and #14923

Airbyte provides a code generator which bootstraps the scaffolding for our connector.

```
cd airbyte-integrations/connector-templates/generator
./generate.sh
```

This will bring up an interactive helper application. Use the arrow keys to pick a template from the list. Select the `Configuration Based Source` template and then input the name of your connector. The application will create a new directory in `airbyte/airbyte-integrations/connectors/` with the name of your new connector.

```
Configuration Based Source
Source name: exchange-rates-tutorial
```

For this walkthrough, we'll refer to our source as `exchange-rates-tutorial`. The complete source code for this tutorial can be found here <FIXME: there should be a link to the complete tutorial?
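
The paths used in the rest of this tutorial suggest a naming convention: the connector directory is the source name prefixed with `source-`, and the Python package uses underscores instead of hyphens. The sketch below derives those names. `connector_names` is a hypothetical helper written for this illustration, and the generator's actual logic may differ.

```python
def connector_names(source_name: str) -> dict:
    """Derive the directory, package, and definition file names (assumed convention)."""
    directory = f"source-{source_name}"
    package = directory.replace("-", "_")
    definition_file = source_name.replace("-", "_") + ".yaml"
    return {
        "directory": f"airbyte-integrations/connectors/{directory}",
        "package": package,
        "definition": f"{package}/{definition_file}",
    }


names = connector_names("exchange-rates-tutorial")
for label, value in names.items():
    print(f"{label}: {value}")
```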

## Next steps

Next, [we'll install dependencies required to run the connector](./2-install-dependencies.md)
@@ -0,0 +1,45 @@
# Step 2: Install dependencies

Let's create a Python virtual environment for our source.
You can do this by executing the following commands from the root of the Airbyte repository.

The commands below assume that `python` points to a version of Python >=3.9.0. On some systems, `python` points to a Python 2 installation and `python3` points to Python 3.
If this is the case on your machine, substitute the `python` commands with `python3`.
The subsequent `python` invocations will use the virtual environment created for the connector.

```
python tools/bin/update_intellij_venv.py -modules source-exchange-rates-tutorial --install-venv
cd airbyte-integrations/connectors/source-exchange-rates-tutorial
source .venv/bin/activate
```

These steps create an initial python environment, and install the dependencies required to run an API Source connector.
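
If you are unsure which interpreter `python` resolves to, a quick check along these lines can help. This is only a sketch: `meets_minimum` is a hypothetical helper, and the `(3, 9)` threshold is taken from the requirement stated in this tutorial.

```python
import sys

MIN_VERSION = (3, 9)  # minimum version required by this tutorial


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print(f"Python {current} is recent enough")
else:
    print(f"Python {current} is too old; try `python3` instead")
```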

Let's verify everything works as expected by running the Airbyte `spec` operation:

```
python main.py spec

```

You should see an output similar to the one below:

```
{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", "connectionSpecification": {"$schema": "http://json-schema.org/draft-07/schema#", "title": "Python Http Tutorial Spec", "type": "object", "required": ["TODO"], "additionalProperties": false, "properties": {"TODO: This schema defines the configuration required for the source. This usually involves metadata such as database and/or authentication information.": {"type": "string", "description": "describe me"}}}}}
```

More details on the `spec` operation can be found in [Basic Concepts](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts) and [Defining Stream Schemas](https://docs.airbyte.com/connector-development/cdk-python/schemas), but this is a simple sanity check to make sure everything is wired up correctly.
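
Since each line of output is a JSON message, the sanity check can also be done programmatically. The following sketch picks out the `SPEC` message from a list of output lines; `extract_spec` is a hypothetical helper, and the sample line is an abbreviated version of the output shown above.

```python
import json
from typing import Iterable, Optional


def extract_spec(output_lines: Iterable[str]) -> Optional[dict]:
    """Return the `spec` payload of the first SPEC message, or None."""
    for line in output_lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "SPEC":
            return message.get("spec")
    return None


# Abbreviated version of the output shown above.
sample_output = [
    '{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", '
    '"connectionSpecification": {"title": "Python Http Tutorial Spec", '
    '"type": "object", "required": ["TODO"], "properties": {}}}}'
]
spec = extract_spec(sample_output)
print(spec["connectionSpecification"]["title"])
```

In practice, the lines could come from running the command with `subprocess.run(..., capture_output=True, text=True)` and splitting its standard output.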

For now, note that the `main.py` file is a convenience wrapper to help run the connector.
Its invocation format is `python main.py <command> [args]`.
The module's generated `README.md` contains more details on the supported commands.

## Next steps

Next, we'll [connect to the API source](./3-connecting.md)

## More readings

- [Basic Concepts](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts)
- [Defining Stream Schemas](https://docs.airbyte.com/connector-development/cdk-python/schemas)
- The module's generated `README.md` contains more details on the supported commands.
@@ -0,0 +1,180 @@
# Step 3: Connecting to the API

We're now ready to start implementing the connector.

The code generator already created a boilerplate connector definition in `source-exchange-rates-tutorial/source_exchange_rates_tutorial/exchange_rates_tutorial.yaml`:

```
schema_loader:
  type: JsonSchema
  file_path: "./source_exchange_rates_tutorial/schemas/{{ name }}.json"
selector:
  type: RecordSelector
  extractor:
    type: JelloExtractor
    transform: "_"
requester:
  type: HttpRequester
  name: "{{ options['name'] }}"
  url_base: TODO "your_api_base_url"
  http_method: "GET"
  authenticator:
    type: TokenAuthenticator
    token: "{{ config['api_key'] }}"
retriever:
  type: SimpleRetriever
  name: "{{ options['name'] }}"
  primary_key: "{{ options['primary_key'] }}"
  record_selector:
    ref: "*ref(selector)"
  paginator:
    type: NoPagination
  state:
    class_name: airbyte_cdk.sources.declarative.states.dict_state.DictState
customers_stream:
  type: DeclarativeStream
  options:
    name: "customers"
  primary_key: "id"
  schema_loader:
    ref: "*ref(schema_loader)"
  retriever:
    ref: "*ref(retriever)"
    requester:
      ref: "*ref(requester)"
      path: TODO "your_endpoint_path"
streams:
  - "*ref(customers_stream)"
check:
  type: CheckStream
  stream_names: ["customers_stream"]
```
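
The `ref: "*ref(...)"` entries let one definition reuse another. As a rough mental model, and not the CDK's actual implementation, a reference can be thought of as copying the referenced definition and then applying the sibling keys as overrides. The toy `resolve` function below is an assumption made for illustration, as are the placeholder URL and path.

```python
import copy
import re

# Matches strings of the form "*ref(name)".
REF_PATTERN = re.compile(r"^\*ref\((?P<name>[^)]+)\)$")


def resolve(node, definitions):
    """Recursively expand "*ref(name)" strings using top-level definitions."""
    if isinstance(node, str):
        match = REF_PATTERN.match(node)
        if match:
            target = copy.deepcopy(definitions[match.group("name")])
            return resolve(target, definitions)
        return node
    if isinstance(node, list):
        return [resolve(item, definitions) for item in node]
    if isinstance(node, dict):
        resolved = {}
        if "ref" in node:
            # Start from the referenced definition...
            resolved.update(resolve(node["ref"], definitions))
        for key, value in node.items():
            if key != "ref":
                # ...then let sibling keys override or extend it.
                resolved[key] = resolve(value, definitions)
        return resolved
    return node


definitions = {
    "requester": {"type": "HttpRequester", "url_base": "https://api.example.com"},
    "retriever": {"type": "SimpleRetriever", "paginator": {"type": "NoPagination"}},
}
stream = {
    "type": "DeclarativeStream",
    "retriever": {
        "ref": "*ref(retriever)",
        "requester": {"ref": "*ref(requester)", "path": "/customers"},
    },
}
resolved_stream = resolve(stream, definitions)
print(resolved_stream["retriever"]["requester"])
```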

Let's fill in these TODOs with the information found in the [Exchange Rates API documentation](https://exchangeratesapi.io/documentation/).

1. First, let's rename the stream from `customers` to `rates`.

```
rates_stream:
  type: DeclarativeStream
  options:
    name: "rates"
```

Then update the references in the `streams` list and the `check` block:

```
streams:
  - "*ref(rates_stream)"
check:
  type: CheckStream
  stream_names: ["rates_stream"]
```

2. Next, we'll set the base URL.
According to the API documentation, the base URL is "https://api.exchangeratesapi.io/v1/".
This can be set in the requester definition.

```
requester:
  type: HttpRequester
  name: "{{ options['name'] }}"
  url_base: "https://api.exchangeratesapi.io/v1/"
```

3. We can fetch the latest data by submitting a request to "/latest". This path is specific to the stream, so we'll set it within the `rates_stream` definition.

```
rates_stream:
  type: DeclarativeStream
  options:
    name: "rates"
  primary_key: "id"
  schema_loader:
    ref: "*ref(schema_loader)"
  retriever:
    ref: "*ref(retriever)"
    requester:
      ref: "*ref(requester)"
      path: "/latest"
```
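
With a base URL ending in `/` and a path starting with `/`, it helps to be explicit about how the two are expected to combine. How the CDK itself joins them is not shown in this tutorial, so the snippet below is only a sketch of the intended result; `build_url` is a hypothetical helper.

```python
from urllib.parse import urljoin


def build_url(url_base: str, path: str) -> str:
    """Join base and path with exactly one slash between them."""
    return url_base.rstrip("/") + "/" + path.lstrip("/")


url_base = "https://api.exchangeratesapi.io/v1/"
path = "/latest"

print(build_url(url_base, path))
# For comparison, urljoin treats a leading "/" as an absolute path and drops "/v1".
print(urljoin(url_base, path))
```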

4. Next, we'll set up the authentication.
The Exchange Rates API requires an access key, which we'll need to make accessible to our connector.
We'll pass the access key as a request parameter whose value points to a field in the config, which we'll populate in a later step:

```
requester:
  type: HttpRequester
  name: "{{ options['name'] }}"
  url_base: "https://api.exchangeratesapi.io/v1/"
  http_method: "GET"
  request_options_provider:
    request_parameters:
      access_key: "{{ config.access_key }}"
```

5. According to the Exchange Rates API documentation, we can specify the base currency of interest in a request parameter:

```
request_options_provider:
  request_parameters:
    access_key: "{{ config.access_key }}"
    base: "{{ config.base }}"
```
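
To see what these request parameters amount to, the sketch below substitutes config values into the `{{ config... }}` placeholders and builds the resulting query string. The CDK has its own interpolation mechanism; the `interpolate` function here is a toy stand-in that only understands `config` lookups in dot or bracket notation, and the access key is a fake value.

```python
import re
from urllib.parse import urlencode

# Matches "{{ config.key }}" as well as "{{ config['key'] }}".
CONFIG_REFERENCE = re.compile(r"""\{\{\s*config(?:\.(\w+)|\[['"](\w+)['"]\])\s*\}\}""")


def interpolate(template: str, config: dict) -> str:
    """Replace config placeholders in `template` with values from `config`."""

    def lookup(match):
        key = match.group(1) or match.group(2)
        return str(config[key])

    return CONFIG_REFERENCE.sub(lookup, template)


request_parameters = {
    "access_key": "{{ config.access_key }}",
    "base": "{{ config.base }}",
}
config = {"access_key": "abc123", "base": "USD"}  # fake access key

rendered = {name: interpolate(value, config) for name, value in request_parameters.items()}
url = "https://api.exchangeratesapi.io/v1/latest?" + urlencode(rendered)
print(url)
```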

6. Let's populate the config so the connector can access the access key and base currency.
First, we'll add these properties to the connector spec in
`source-exchange-rates-tutorial/source_exchange_rates_tutorial/spec.yaml`

```
documentationUrl: https://docs.airbyte.io/integrations/sources/exchangeratesapi
connectionSpecification:
  $schema: http://json-schema.org/draft-07/schema#
  title: exchangeratesapi.io Source Spec
  type: object
  required:
    - access_key
    - base
  additionalProperties: false
  properties:
    access_key:
      type: string
      description: >-
        Your API Access Key. See <a
        href="https://exchangeratesapi.io/documentation/">here</a>. The key is
        case sensitive.
      airbyte_secret: true
    base:
      type: string
      description: >-
        ISO reference currency. See <a
        href="https://www.ecb.europa.eu/stats/policy_and_exchange_rates/euro_reference_exchange_rates/html/index.en.html">here</a>.
      examples:
        - EUR
        - USD
```
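
The spec constrains which configs are acceptable: both properties are required, both are strings, and no other properties are allowed. The sketch below hand-checks just those three rules using only the standard library. It is not a full JSON Schema validator, and `validate_config` is a hypothetical helper.

```python
# Reduced copy of the connectionSpecification above (descriptions omitted).
SPEC = {
    "required": ["access_key", "base"],
    "additionalProperties": False,
    "properties": {
        "access_key": {"type": "string"},
        "base": {"type": "string"},
    },
}


def validate_config(config: dict, spec: dict = SPEC) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in spec["required"]:
        if name not in config:
            errors.append(f"missing required property: {name}")
    for name, value in config.items():
        prop = spec["properties"].get(name)
        if prop is None:
            if not spec.get("additionalProperties", True):
                errors.append(f"unexpected property: {name}")
        elif prop["type"] == "string" and not isinstance(value, str):
            errors.append(f"property {name} must be a string")
    return errors


print(validate_config({"access_key": "abc123", "base": "USD"}))
print(validate_config({"base": "USD", "currency": "CAD"}))
```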

7. We also need to fill in the connection config in `secrets/config.json`.
Because the access key is sensitive, we recommend storing this config in the `secrets` directory, which is ignored by git.

```
echo '{"access_key": "<your_access_key>", "base": "USD"}' > secrets/config.json
```
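
If shell quoting gets in the way, the same file can be written from Python. This is a sketch of an equivalent approach, not part of the generated connector; `write_config` is a hypothetical helper, and the example writes into a temporary directory so that it has no side effects.

```python
import json
import tempfile
from pathlib import Path


def write_config(connector_dir: Path, access_key: str, base: str) -> Path:
    """Write secrets/config.json under `connector_dir` and return its path."""
    secrets_dir = connector_dir / "secrets"
    secrets_dir.mkdir(parents=True, exist_ok=True)
    config_path = secrets_dir / "config.json"
    config_path.write_text(json.dumps({"access_key": access_key, "base": base}))
    return config_path


# Demonstration in a throwaway directory; in the tutorial, `connector_dir`
# would be the connector's own directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_config(Path(tmp), "<your_access_key>", "USD")
    loaded = json.loads(path.read_text())
    print(path.name, loaded)
```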

We can now run the `check` operation, which verifies the connector can connect to the API source.

```
python main.py check --config secrets/config.json
```

which should now succeed with logs similar to:

```
{"type": "LOG", "log": {"level": "INFO", "message": "Check succeeded"}}
{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}}
```
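
When scripting this check, the status can be read from the `CONNECTION_STATUS` message instead of being inspected by eye. The sketch below does this for the two sample lines above; `connection_status` is a hypothetical helper.

```python
import json
from typing import Iterable, Optional


def connection_status(output_lines: Iterable[str]) -> Optional[str]:
    """Return the status of the first CONNECTION_STATUS message, or None."""
    for line in output_lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON messages
        if isinstance(message, dict) and message.get("type") == "CONNECTION_STATUS":
            return message["connectionStatus"]["status"]
    return None


sample_output = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "Check succeeded"}}',
    '{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}}',
]
status = connection_status(sample_output)
print(status)
```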

## Next steps

Next, we'll [extract the records from the response](4-reading-data.md)