Commit e4c942e

[python-cdk] README cleanup (#37306)
1 parent 33235c8 commit e4c942e

2 files changed

+116
-73
lines changed

airbyte-cdk/python/README.md

+97-65
Original file line numberDiff line numberDiff line change
@@ -1,57 +1,79 @@
1-
# Connector Development Kit \(Python\)
1+
# Airbyte Python CDK and Low-Code CDK
22

3-
The Airbyte Python CDK is a framework for rapidly developing production-grade Airbyte connectors.The CDK currently offers helpers specific for creating Airbyte source connectors for:
3+
Airbyte Python CDK is a framework for building Airbyte API Source Connectors. It provides a set of
4+
classes and helpers that make it easy to build a connector against an HTTP API (REST, GraphQL, etc.),
5+
or a generic Python source connector.
46

5-
- HTTP APIs \(REST APIs, GraphQL, etc..\)
6-
- Generic Python sources \(anything not covered by the above\)
7+
## Usage
78

8-
The CDK provides an improved developer experience by providing basic implementation structure and abstracting away low-level glue boilerplate.
9+
If you're looking to build a connector, we highly recommend that you
10+
[start with the Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview).
11+
It should be enough for 90% of connectors out there. For more flexible and complex connectors, use the
12+
[low-code CDK and `SourceDeclarativeManifest`](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).
913

10-
This document is a general introduction to the CDK. Readers should have basic familiarity with the [Airbyte Specification](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/) before proceeding.
14+
If that doesn't work, then consider building on top of the
15+
[lower-level Python CDK itself](https://docs.airbyte.com/connector-development/cdk-python/).
1116

12-
# Setup
17+
### Quick Start
1318

14-
## Prerequisites
15-
16-
#### Poetry
17-
18-
Before you can start working on this project, you will need to have Poetry installed on your system. Please follow the instructions below to install Poetry:
19-
20-
1. Open your terminal or command prompt.
21-
2. Install Poetry using the recommended installation method:
19+
To get started on a Python CDK based connector or a low-code connector, you can generate a connector
20+
project from a template:
2221

2322
```bash
24-
curl -sSL https://install.python-poetry.org | POETRY_VERSION=1.5.1 python3 -
23+
# from the repo root
24+
cd airbyte-integrations/connector-templates/generator
25+
./generate.sh
2526
```
2627

27-
Alternatively, you can use `pip` to install Poetry:
28-
29-
```bash
30-
pip install --user poetry
31-
```
32-
33-
3. After the installation is complete, close and reopen your terminal to ensure the newly installed `poetry` command is available in your system's PATH.
34-
35-
For more detailed instructions and alternative installation methods, please refer to the official Poetry documentation: https://python-poetry.org/docs/#installation
36-
37-
### Concepts & Documentation
38-
39-
See the [concepts docs](docs/concepts/) for a tour through what the API offers.
40-
4128
### Example Connectors
4229

4330
**HTTP Connectors**:
4431

45-
- [Stripe](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/source_stripe/source.py)
46-
- [Slack](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-slack/source_slack/source.py)
32+
- [Stripe](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/)
33+
- [Salesforce](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/)
4734

4835
**Simple Python connectors using the bare-bones `Source` abstraction**:
4936

5037
- [Google Sheets](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-google-sheets/google_sheets_source/google_sheets_source.py)
51-
- [Mailchimp](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-mailchimp/source_mailchimp/source.py)
38+
39+
This will generate a project with a type and a name of your choice and put it in
40+
`airbyte-integrations/connectors`. Open the directory with your connector in an editor and follow
41+
the `TODO` items.
42+
43+
## Python CDK Overview
44+
45+
The Airbyte CDK code lives in the `airbyte_cdk` directory. Here's a high-level overview of what's inside:
46+
47+
- `connector_builder`. Internal wrapper that helps the Connector Builder platform run a declarative
48+
manifest (low-code connector). You should not use this code directly. If you need to run a
49+
`SourceDeclarativeManifest`, take a look at
50+
[`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest)
51+
connector implementation instead.
52+
- `destinations`. Basic Destination connector support! If you're building a Destination connector in
53+
Python, start here. Some of our vector DB destinations, such as `destination-pinecone`, use this
54+
code.
55+
- `models` exposes `airbyte_protocol.models` as part of the `airbyte_cdk` package.
56+
- `sources/concurrent_source` is the Concurrent CDK implementation. It supports reading data from
57+
streams concurrently per slice / partition, useful for connectors with high throughput and high
58+
number of records.
59+
- `sources/declarative` is the low-code CDK. It works on top of Airbyte Python CDK, but provides a
60+
declarative manifest language to define streams, operations, etc. This makes it easier to build
61+
connectors without writing Python code.
62+
- `sources/file_based` is the CDK for file-based sources. Examples include S3, Azure, GCS, etc.
63+
- `sources/singer` is a Singer tap source adapter. Deprecated.
5264

5365
## Contributing
5466

67+
Thank you for being interested in contributing to Airbyte Python CDK! Here are some guidelines to
68+
get you started:
69+
70+
- We adhere to the [code of conduct](/CODE_OF_CONDUCT.md).
71+
- You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving [documentation](/docs/), and
72+
submitting pull requests with bugfixes and new features alike.
73+
- If you're changing the code, please add unit tests for your change.
74+
- When submitting issues or PRs, please add a small reproduction project. Using the changes in your
75+
connector and providing that connector code as an example (or a satellite PR) helps!
76+
5577
### First time setup
5678

5779
Install the project dependencies and development tools:
@@ -62,61 +84,58 @@ poetry install --all-extras
6284

6385
Installing all extras is required to run the full suite of unit tests.
6486

65-
#### Iteration
87+
#### Running tests locally
6688

6789
- Iterate on the CDK code locally
68-
- Run tests via `poetry run poe unit-test-with-cov`, or `python -m pytest -s unit_tests` if you want to pass pytest options.
69-
- Run `poetry run poe check-local` to lint all code, type-check modified code, and run unit tests with coverage in one command.
90+
- Run tests via `poetry run poe unit-test-with-cov`, or `python -m pytest -s unit_tests` if you want
91+
to pass pytest options.
92+
- Run `poetry run poe check-local` to lint all code, type-check modified code, and run unit tests
93+
with coverage in one command.
7094

7195
To see all available scripts, run `poetry run poe`.
7296

7397
##### Autogenerated files
7498

75-
If the iteration you are working on includes changes to the models or the connector generator, you might want to regenerate them. In order to do that, you can run:
99+
Low-code CDK models are generated from `sources/declarative/declarative_component_schema.yaml`. If
100+
the iteration you are working on includes changes to the models or the connector generator, you
101+
might want to regenerate them. In order to do that, you can run:
76102

77103
```bash
78104
poetry run poe build
79105
```
80106

81-
This will generate the code generator docker image and the component manifest files based on the schemas and templates.
107+
This will generate the code generator docker image and the component manifest files based on the
108+
schemas and templates.
82109

83110
#### Testing
84111

85-
All tests are located in the `unit_tests` directory. Run `poetry run poe unit-test-with-cov` to run them. This also presents a test coverage report. For faster iteration with no coverage report and more options, `python -m pytest -s unit_tests` is a good place to start.
112+
All tests are located in the `unit_tests` directory. Run `poetry run poe unit-test-with-cov` to run
113+
them. This also presents a test coverage report. For faster iteration with no coverage report and
114+
more options, `python -m pytest -s unit_tests` is a good place to start.
86115

87116
#### Building and testing a connector with your local CDK
88117

89-
When developing a new feature in the CDK, you may find it helpful to run a connector that uses that new feature. You can test this in one of two ways:
118+
When developing a new feature in the CDK, you may find it helpful to run a connector that uses that
119+
new feature. You can test this in one of two ways:
90120

91121
- Running a connector locally
92122
- Building and running a source via Docker
93123

94124
##### Installing your local CDK into a local Python connector
95125

96-
In order to get a local Python connector running your local CDK, do the following.
97-
98-
First, make sure you have your connector's virtual environment active:
99-
100-
```bash
101-
# from the `airbyte/airbyte-integrations/connectors/<connector-directory>` directory
102-
source .venv/bin/activate
103-
104-
# if you haven't installed dependencies for your connector already
105-
pip install -e .
106-
```
126+
Open the connector's `pyproject.toml` file and replace the line with `airbyte_cdk` with the
127+
following:
107128

108-
Then, navigate to the CDK and install it in editable mode:
109-
110-
```bash
111-
cd ../../../airbyte-cdk/python
112-
pip install -e .
129+
```toml
130+
airbyte_cdk = { path = "../../../airbyte-cdk/python/airbyte_cdk", develop = true }
113131
```
114132

115-
You should see that `pip` has uninstalled the version of `airbyte-cdk` defined by your connector's `setup.py` and installed your local CDK. Any changes you make will be immediately reflected in your editor, so long as your editor's interpreter is set to your connector's virtual environment.
133+
Then, running `poetry update` should reinstall `airbyte_cdk` from your local working directory.
116134

117135
##### Building a Python connector in Docker with your local CDK installed
118136

119-
_Pre-requisite: Install the [`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
137+
_Pre-requisite: Install the
138+
[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
120139

121140
You can build your connector image with the local CDK using
122141

@@ -125,23 +144,30 @@ You can build your connector image with the local CDK using
125144
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
126145
```
127146

128-
Note that the local CDK is injected at build time, so if you make changes, you will have to run the build command again to see them reflected.
147+
Note that the local CDK is injected at build time, so if you make changes, you will have to run the
148+
build command again to see them reflected.
129149

130150
##### Running Connector Acceptance Tests for a single connector in Docker with your local CDK installed
131151

132-
_Pre-requisite: Install the [`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
152+
_Pre-requisite: Install the
153+
[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
133154

134-
To run acceptance tests for a single connectors using the local CDK, from the connector directory, run
155+
To run acceptance tests for a single connector using the local CDK, from the connector directory,
156+
run
135157

136158
```bash
137159
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test
138160
```
139161

140162
#### When you don't have access to the API
141163

142-
There may be a time when you do not have access to the API (either because you don't have the credentials, network access, etc...) You will probably still want to do end-to-end testing at least once. In order to do so, you can emulate the server you would be reaching using a server stubbing tool.
164+
There may be a time when you do not have access to the API (either because you don't have the
165+
credentials, network access, etc.). You will probably still want to do end-to-end testing at least
166+
once. In order to do so, you can emulate the server you would be reaching using a server stubbing
167+
tool.
143168

144-
For example, using [mockserver](https://www.mock-server.com/), you can set up an expectation file like this:
169+
For example, using [mockserver](https://www.mock-server.com/), you can set up an expectation file
170+
like this:
145171

146172
```json
147173
{
@@ -155,13 +181,19 @@ For example, using [mockserver](https://www.mock-server.com/), you can set up an
155181
}
156182
```
157183

158-
Assuming this file has been created at `secrets/mock_server_config/expectations.json`, running the following command will allow to match any requests on path `/data` to return the response defined in the expectation file:
184+
Assuming this file has been created at `secrets/mock_server_config/expectations.json`, running the
185+
following command will make any request on path `/data` return the response defined in
186+
the expectation file:
159187

160188
```bash
161189
docker run -d --rm -v $(pwd)/secrets/mock_server_config:/config -p 8113:8113 --env MOCKSERVER_LOG_LEVEL=TRACE --env MOCKSERVER_SERVER_PORT=8113 --env MOCKSERVER_WATCH_INITIALIZATION_JSON=true --env MOCKSERVER_PERSISTED_EXPECTATIONS_PATH=/config/expectations.json --env MOCKSERVER_INITIALIZATION_JSON_PATH=/config/expectations.json mockserver/mockserver:5.15.0
162190
```
163191

164-
HTTP requests to `localhost:8113/data` should now return the body defined in the expectations file. To test this, the implementer either has to change the code which defines the base URL for Python source or update the `url_base` from low-code. With the Connector Builder running in docker, you will have to use domain `host.docker.internal` instead of `localhost` as the requests are executed within docker.
192+
HTTP requests to `localhost:8113/data` should now return the body defined in the expectations file.
193+
To test this, the implementer either has to change the code which defines the base URL for Python
194+
source or update the `url_base` from low-code. With the Connector Builder running in docker, you
195+
will have to use domain `host.docker.internal` instead of `localhost` as the requests are executed
196+
within docker.
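
As a rough illustration of the host swap described above, the sketch below rewrites a base URL that points at `localhost` so that it targets `host.docker.internal` instead. The helper name `to_docker_host` is hypothetical and is not part of the CDK; it only shows the string transformation you would apply to a `url_base` value by hand.

```python
from urllib.parse import urlsplit, urlunsplit


def to_docker_host(url: str, docker_host: str = "host.docker.internal") -> str:
    """Return `url` with a `localhost` host replaced by `docker_host`.

    Hypothetical helper for illustration only. URLs that do not point at
    `localhost` are returned unchanged. Any user:password part is dropped.
    """
    parts = urlsplit(url)
    if parts.hostname != "localhost":
        return url
    # Keep the original port (for example 8113 for the mock server) if present.
    netloc = docker_host if parts.port is None else f"{docker_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(to_docker_host("http://localhost:8113/data"))
```

The same mock server started with the `docker run` command above can then serve both a locally running connector (via `localhost`) and one whose requests are executed inside Docker (via the rewritten URL).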
165197

166198
#### Publishing a new version to PyPi
167199

Original file line numberDiff line numberDiff line change
@@ -1,42 +1,53 @@
11
# Connector Builder Backend
22

3-
This is the backend for requests from the [Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview/).
3+
This is the backend for requests from the
4+
[Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview/).
45

56
## Local development
67

78
### Locally running the Connector Builder backend
89

9-
```
10+
```bash
1011
python main.py read --config path/to/config --catalog path/to/catalog
1112
```
1213

1314
Note:
14-
- Requires the keys `__injected_declarative_manifest` and `__command` in its config, where `__injected_declarative_manifest` is a JSON manifest and `__command` is one of the commands handled by the ConnectorBuilderHandler (`stream_read` or `resolve_manifest`), i.e.
15-
```
15+
16+
- Requires the keys `__injected_declarative_manifest` and `__command` in its config, where
17+
`__injected_declarative_manifest` is a JSON manifest and `__command` is one of the commands
18+
  handled by the ConnectorBuilderHandler (`test_read` or `resolve_manifest`), i.e.
19+
20+
```json
1621
{
1722
"config": <normal config>,
1823
"__injected_declarative_manifest": {...},
1924
"__command": <"resolve_manifest" | "test_read">
2025
}
2126
```
22-
*See [ConnectionSpecification](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/#actor-specification) for details on the `"config"` key if needed.
27+
28+
\*See
29+
[ConnectionSpecification](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/#actor-specification)
30+
for details on the `"config"` key if needed.
2331

2432
- When the `__command` is `resolve_manifest`, the argument to `catalog` should be an empty string.
25-
- The config can optionally contain an object under the `__test_read_config` key which can define custom test read limits with `max_records`, `max_slices`, and `max_pages_per_slice` properties. All custom limits are optional; a default will be used for any limit that is not provided.
33+
- The config can optionally contain an object under the `__test_read_config` key which can define
34+
custom test read limits with `max_records`, `max_slices`, and `max_pages_per_slice` properties.
35+
All custom limits are optional; a default will be used for any limit that is not provided.
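
The config shape described in the notes above could be assembled with a small script like the following. This is a minimal sketch, not part of the CDK: the function name `build_builder_config` and the output path are assumptions, while the key names (`config`, `__injected_declarative_manifest`, `__command`, `__test_read_config`) and the three limit names come from the notes and the JSON example above.

```python
import json
from typing import Any, Dict, Optional

# Limit names listed in the note on `__test_read_config`.
KNOWN_LIMITS = {"max_records", "max_slices", "max_pages_per_slice"}


def build_builder_config(
    config: Dict[str, Any],
    manifest: Dict[str, Any],
    command: str,
    test_read_config: Optional[Dict[str, int]] = None,
) -> Dict[str, Any]:
    """Assemble the config object in the shape shown in the JSON example.

    Hypothetical helper for illustration only.
    """
    payload: Dict[str, Any] = {
        "config": config,
        "__injected_declarative_manifest": manifest,
        "__command": command,
    }
    if test_read_config is not None:
        unknown = set(test_read_config) - KNOWN_LIMITS
        if unknown:
            raise ValueError(f"Unknown test read limits: {sorted(unknown)}")
        # Limits are optional; omitted ones fall back to defaults.
        payload["__test_read_config"] = test_read_config
    return payload


if __name__ == "__main__":
    cfg = build_builder_config(
        config={},  # placeholder for the connector's normal config
        manifest={},  # placeholder for the JSON manifest
        command="resolve_manifest",
    )
    # The output path is an assumption; pass it to `--config`.
    with open("builder_config.json", "w") as f:
        json.dump(cfg, f, indent=2)
```

Validating the limit names up front is a design choice of this sketch; it catches typos such as `max_record` before the file is handed to `python main.py read`.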
2636

2737
### Locally running the docker image
2838

2939
#### Build
3040

3141
First, make sure you build the latest Docker image:
32-
```
42+
43+
```bash
3344
docker build -t airbyte/source-declarative-manifest:dev .
3445
```
3546

3647
#### Run
3748

3849
Then run any of the connector commands as follows:
3950

40-
```
51+
```bash
4152
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-declarative-manifest:dev read --config /secrets/config.json
4253
```
