Commit cd286a6

✨ source-google-drive: migrate to poetry (#36581)
1 parent 6c8ca12 commit cd286a6

8 files changed, +2612 -125 lines changed

@@ -1,69 +1,55 @@
-# Google Drive Source
+# Google Drive source connector
+
 
 This is the repository for the Google Drive source connector, written in Python.
-For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.io/integrations/sources/google-drive).
+For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.com/integrations/sources/google-drive).
 
 ## Local development
 
 ### Prerequisites
-**To iterate on this connector, make sure to complete this prerequisites section.**
-
-#### Minimum Python version required `= 3.10.0`
+* Python (~=3.9)
+* Poetry (~=1.7) - installation instructions [here](https://python-poetry.org/docs/#installation)
 
-#### Build & Activate Virtual Environment and install dependencies
-From this connector directory, create a virtual environment:
-```
-python -m venv .venv
-```
 
-This will generate a virtualenv for this module in `.venv/`. Make sure this venv is active in your
-development environment of choice. To activate it from the terminal, run:
-```
-source .venv/bin/activate
-pip install -r requirements.txt
-pip install '.[tests]'
+### Installing the connector
+From this connector directory, run:
+```bash
+poetry install --with dev
 ```
-If you are in an IDE, follow your IDE's instructions to activate the virtualenv.
 
-Note that while we are installing dependencies from `requirements.txt`, you should only edit `setup.py` for your dependencies. `requirements.txt` is
-used for editable installs (`pip install -e`) to pull in Python dependencies from the monorepo and will call `setup.py`.
-If this is mumbo jumbo to you, don't worry about it, just put your deps in `setup.py` but install using `pip install -r requirements.txt` and everything
-should work as you expect.
 
-#### Create credentials
-**If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/google-drive)
-to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_google_drive/spec.json` file.
+### Create credentials
+**If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.com/integrations/sources/google-drive)
+to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_google_drive/spec.yaml` file.
 Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
-See `integration_tests/sample_config.json` for a sample config file.
+See `sample_files/sample_config.json` for a sample config file.
 
-**If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source google-drive test creds`
-and place them into `secrets/config.json`.
 
 ### Locally running the connector
 ```
-python main.py spec
-python main.py check --config secrets/config.json
-python main.py discover --config secrets/config.json
-python main.py read --config secrets/config.json --catalog integration_tests/configured_catalog.json
+poetry run source-google-drive spec
+poetry run source-google-drive check --config secrets/config.json
+poetry run source-google-drive discover --config secrets/config.json
+poetry run source-google-drive read --config secrets/config.json --catalog sample_files/configured_catalog.json
 ```
 
-### Locally running the connector docker image
-
+### Running unit tests
+To run unit tests locally, from the connector directory run:
+```
+poetry run pytest unit_tests
+```
 
-#### Build
-**Via [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md) (recommended):**
+### Building the docker image
+1. Install [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)
+2. Run the following command to build the docker image:
 ```bash
 airbyte-ci connectors --name=source-google-drive build
 ```
 
-An image will be built with the tag `airbyte/source-google-drive:dev`.
+An image will be available on your host with the tag `airbyte/source-google-drive:dev`.
 
-**Via `docker build`:**
-```bash
-docker build -t airbyte/source-google-drive:dev .
-```
 
-#### Run
+### Running as a docker container
 Then run any of the connector commands as follows:
 ```
 docker run --rm airbyte/source-google-drive:dev spec
@@ -72,29 +58,34 @@ docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-google-drive:dev disco
 docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-google-drive:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
 ```
 
-## Testing
+### Running our CI test suite
 You can run our full test suite locally using [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md):
 ```bash
 airbyte-ci connectors --name=source-google-drive test
 ```
 
 ### Customizing acceptance Tests
-Customize `acceptance-test-config.yml` file to configure tests. See [Connector Acceptance Tests](https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference) for more information.
+Customize `acceptance-test-config.yml` file to configure acceptance tests. See [Connector Acceptance Tests](https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference) for more information.
 If your connector requires to create or destroy resources for use during acceptance tests create fixtures for it and place them inside integration_tests/acceptance.py.
 
-## Dependency Management
-All of your dependencies should go in `setup.py`, NOT `requirements.txt`. The requirements file is only used to connect internal Airbyte dependencies in the monorepo for local development.
-We split dependencies between two groups, dependencies that are:
-* required for your connector to work need to go to `MAIN_REQUIREMENTS` list.
-* required for the testing need to go to `TEST_REQUIREMENTS` list
+### Dependency Management
+All of your dependencies should be managed via Poetry.
+To add a new dependency, run:
+```bash
+poetry add <package-name>
+```
+
+Please commit the changes to `pyproject.toml` and `poetry.lock` files.
 
-### Publishing a new version of the connector
+## Publishing a new version of the connector
 You've checked out the repo, implemented a million dollar feature, and you're ready to share your changes with the world. Now what?
 1. Make sure your changes are passing our test suite: `airbyte-ci connectors --name=source-google-drive test`
-2. Bump the connector version in `metadata.yaml`: increment the `dockerImageTag` value. Please follow [semantic versioning for connectors](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#semantic-versioning-for-connectors).
+2. Bump the connector version (please follow [semantic versioning for connectors](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#semantic-versioning-for-connectors)):
+- bump the `dockerImageTag` value in in `metadata.yaml`
+- bump the `version` value in `pyproject.toml`
 3. Make sure the `metadata.yaml` content is up to date.
-4. Make the connector documentation and its changelog is up to date (`docs/integrations/sources/google-drive.md`).
+4. Make sure the connector documentation and its changelog is up to date (`docs/integrations/sources/google-drive.md`).
 5. Create a Pull Request: use [our PR naming conventions](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#pull-request-title-convention).
 6. Pat yourself on the back for being an awesome contributor.
 7. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
-
+8. Once your PR is merged, the new version of the connector will be automatically published to Docker Hub and our connector registry.
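
Review note: publishing step 2 in the new README asks for the same version to be bumped in two files, which is easy to get out of sync. A minimal sketch of a consistency check follows. It assumes `pyproject.toml` carries a top-level `version = "..."` line (that file is not shown in this diff), and the function names are hypothetical helpers, not part of `airbyte-ci` or the connector.

```python
import re


def docker_image_tag(metadata_yaml: str) -> str:
    """Extract the dockerImageTag value from the text of metadata.yaml."""
    match = re.search(
        r"^\s*dockerImageTag:\s*['\"]?([^\s'\"]+)", metadata_yaml, re.MULTILINE
    )
    if match is None:
        raise ValueError("dockerImageTag not found in metadata.yaml")
    return match.group(1)


def poetry_version(pyproject_toml: str) -> str:
    """Extract the version value from the text of pyproject.toml.

    Assumption: the file contains a line of the form `version = "x.y.z"`.
    """
    match = re.search(
        r"^version\s*=\s*['\"]([^'\"]+)['\"]", pyproject_toml, re.MULTILINE
    )
    if match is None:
        raise ValueError("version not found in pyproject.toml")
    return match.group(1)


def versions_match(metadata_yaml: str, pyproject_toml: str) -> bool:
    """Return True when both files declare the same connector version."""
    return docker_image_tag(metadata_yaml) == poetry_version(pyproject_toml)
```

A regex is used here only to keep the sketch dependency-free; a real check would more likely parse both files with proper YAML and TOML parsers.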

airbyte-integrations/connectors/source-google-drive/integration_tests/spec.json

+16 -10
@@ -31,9 +31,9 @@
 },
 "globs": {
 "title": "Globs",
+"description": "The pattern used to specify which files should be selected from the file system. For more information on glob pattern matching look <a href=\"https://en.wikipedia.org/wiki/Glob_(programming)\">here</a>.",
 "default": ["**"],
 "order": 1,
-"description": "The pattern used to specify which files should be selected from the file system. For more information on glob pattern matching look <a href=\"https://en.wikipedia.org/wiki/Glob_(programming)\">here</a>.",
 "type": "array",
 "items": {
 "type": "string"
@@ -53,8 +53,8 @@
 "primary_key": {
 "title": "Primary Key",
 "description": "The column or columns (for a composite key) that serves as the unique identifier of a record. If empty, the primary key will default to the parser's default primary key.",
-"type": "string",
-"airbyte_hidden": true
+"airbyte_hidden": true,
+"type": "string"
 },
 "days_to_sync_if_history_is_full": {
 "title": "Days To Sync If History Is Full",
@@ -229,6 +229,12 @@
 "type": "string"
 },
 "uniqueItems": true
+},
+"ignore_errors_on_fields_mismatch": {
+"title": "Ignore errors on field mismatch",
+"description": "Whether to ignore errors that occur when the number of fields in the CSV does not match the number of columns in the schema.",
+"default": false,
+"type": "boolean"
 }
 },
 "required": ["filetype"]
@@ -276,20 +282,20 @@
 "type": "string"
 },
 "skip_unprocessable_files": {
-"type": "boolean",
-"default": true,
 "title": "Skip Unprocessable Files",
 "description": "If true, skip files that cannot be parsed and pass the error message along as the _ab_source_file_parse_error field. If false, fail the sync.",
-"always_show": true
+"default": true,
+"always_show": true,
+"type": "boolean"
 },
 "strategy": {
-"type": "string",
+"title": "Parsing Strategy",
+"description": "The strategy used to parse documents. `fast` extracts text directly from the document which doesn't work for all files. `ocr_only` is more reliable, but slower. `hi_res` is the most reliable, but requires an API key and a hosted instance of unstructured and can't be used with local mode. See the unstructured.io documentation for more details: https://unstructured-io.github.io/unstructured/core/partition.html#partition-pdf",
+"default": "auto",
 "always_show": true,
 "order": 0,
-"default": "auto",
-"title": "Parsing Strategy",
 "enum": ["auto", "fast", "ocr_only", "hi_res"],
-"description": "The strategy used to parse documents. `fast` extracts text directly from the document which doesn't work for all files. `ocr_only` is more reliable, but slower. `hi_res` is the most reliable, but requires an API key and a hosted instance of unstructured and can't be used with local mode. See the unstructured.io documentation for more details: https://unstructured-io.github.io/unstructured/core/partition.html#partition-pdf"
+"type": "string"
 },
 "processing": {
 "title": "Processing",
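
Review note: the only functional addition to the spec in this file is `ignore_errors_on_fields_mismatch`; the other hunks just reorder keys. The sketch below illustrates the behaviour the description suggests, using only the standard `csv` module. It is not the connector's or the CDK's parser. The `read_rows` function and `FieldCountMismatch` class are hypothetical, and it assumes that "ignoring" a mismatch means dropping the offending row.

```python
import csv
import io
from typing import Iterator, List


class FieldCountMismatch(ValueError):
    """Hypothetical error for a row whose field count differs from the header."""


def read_rows(
    csv_text: str, ignore_errors_on_fields_mismatch: bool = False
) -> Iterator[List[str]]:
    """Yield data rows whose field count matches the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to yield
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            if ignore_errors_on_fields_mismatch:
                continue  # assumption: mismatched rows are silently dropped
            raise FieldCountMismatch(
                f"row {line_number}: expected {len(header)} fields, got {len(row)}"
            )
        yield row
```

With the flag left at its default of `false`, the first malformed row fails the read, which matches the spec's default.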

airbyte-integrations/connectors/source-google-drive/metadata.yaml

+1 -1
@@ -7,7 +7,7 @@ data:
 connectorSubtype: file
 connectorType: source
 definitionId: 9f8dda77-1048-4368-815b-269bf54ee9b8
-dockerImageTag: 0.0.9
+dockerImageTag: 0.0.10
 dockerRepository: airbyte/source-google-drive
 githubIssueLabel: source-google-drive
 icon: google-drive.svg
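
Review note: the bump from `0.0.9` to `0.0.10` is a patch increment, but the two tags sort the wrong way round if compared as plain strings. A small sketch of a numeric comparison follows. It assumes tags are plain `MAJOR.MINOR.PATCH` integers with no pre-release suffix, and both function names are hypothetical.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Split a dotted tag such as '0.0.10' into a tuple of integers."""
    return tuple(int(part) for part in tag.split("."))


def is_version_bump(old_tag: str, new_tag: str) -> bool:
    """Return True when new_tag is strictly greater than old_tag, compared numerically."""
    return parse_version(new_tag) > parse_version(old_tag)
```

Comparing integer tuples avoids the lexicographic trap where the string `"0.0.10"` sorts before `"0.0.9"`.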
