Commit 88d743b

Xabilahu, marcosmarxm, and octavia-squidington-iii authored
🎉 New Source: GNews [low-code CDK] (#18808)
* Initial GNews source connector implementation
* Update changelog with PR id
* Add support for incremental syncs and error handling
* Make tests pass
* run format
* auto-bump connector version

Co-authored-by: marcosmarxm <[email protected]>
Co-authored-by: Octavia Squidington III <[email protected]>
Co-authored-by: Marcos Marx <[email protected]>
1 parent 233dfd1 commit 88d743b

29 files changed (+1040, -0 lines)

airbyte-config/init/src/main/resources/seed/source_definitions.yaml

Lines changed: 7 additions & 0 deletions
@@ -482,6 +482,13 @@
   icon: glassfrog.svg
   sourceType: api
   releaseStage: alpha
+- name: GNews
+  sourceDefinitionId: ce38aec4-5a77-439a-be29-9ca44fd4e811
+  dockerRepository: airbyte/source-gnews
+  dockerImageTag: 0.1.0
+  documentationUrl: https://docs.airbyte.com/integrations/sources/gnews
+  sourceType: api
+  releaseStage: alpha
 - name: GoCardless
   sourceDefinitionId: ba15ac82-5c6a-4fb2-bf24-925c23a1180c
   dockerRepository: airbyte/source-gocardless

airbyte-config/init/src/main/resources/seed/source_specs.yaml

Lines changed: 229 additions & 0 deletions
@@ -4271,6 +4271,235 @@
   supportsNormalization: false
   supportsDBT: false
   supported_destination_sync_modes: []
+- dockerImage: "airbyte/source-gnews:0.1.0"
+  spec:
+    documentationUrl: "https://docs.airbyte.com/integrations/sources/gnews"
+    connectionSpecification:
+      $schema: "http://json-schema.org/draft-07/schema#"
+      title: "Gnews Spec"
+      type: "object"
+      required:
+      - "api_key"
+      - "query"
+      additionalProperties: true
+      properties:
+        api_key:
+          type: "string"
+          title: "API Key"
+          description: "API Key"
+          order: 0
+          airbyte_secret: true
+        query:
+          type: "string"
+          order: 1
+          title: "Query"
+          description: "This parameter allows you to specify your search keywords\
+            \ to find the news articles you are looking for. The keywords will be\
+            \ used to return the most relevant articles. It is possible to use logical\
+            \ operators with keywords. - Phrase Search Operator: This operator allows\
+            \ you to make an exact search. Keywords surrounded by \n quotation marks\
+            \ are used to search for articles with the exact same keyword sequence.\
+            \ \n For example the query: \"Apple iPhone\" will return articles matching\
+            \ at least once this sequence of keywords.\n- Logical AND Operator: This\
+            \ operator allows you to make sure that several keywords are all used\
+            \ in the article\n search. By default the space character acts as an\
+            \ AND operator, it is possible to replace the space character \n by AND\
+            \ to obtain the same result. For example the query: Apple Microsoft is\
+            \ equivalent to Apple AND Microsoft\n- Logical OR Operator: This operator\
+            \ allows you to retrieve articles matching the keyword a or the keyword\
+            \ b.\n It is important to note that this operator has a higher precedence\
+            \ than the AND operator. For example the \n query: Apple OR Microsoft\
+            \ will return all articles matching the keyword Apple as well as all articles\
+            \ matching \n the keyword Microsoft\n- Logical NOT Operator: This operator\
+            \ allows you to remove from the results the articles corresponding to\
+            \ the\n specified keywords. To use it, you need to add NOT in front of\
+            \ each word or phrase surrounded by quotes.\n For example the query:\
+            \ Apple NOT iPhone will return all articles matching the keyword Apple\
+            \ but not the keyword\n iPhone"
+          examples:
+          - "Microsoft Windows 10"
+          - "Apple OR Microsoft"
+          - "Apple AND NOT iPhone"
+          - "(Windows 7) AND (Windows 10)"
+          - "Intel AND (i7 OR i9)"
+        language:
+          type: "string"
+          title: "Language"
+          description: "This parameter allows you to specify the language of the news\
+            \ articles returned by the API. You have to set as value the 2 letters\
+            \ code of the language you want to filter."
+          order: 2
+          enum:
+          - "ar"
+          - "zh"
+          - "nl"
+          - "en"
+          - "fr"
+          - "de"
+          - "el"
+          - "he"
+          - "hi"
+          - "it"
+          - "ja"
+          - "ml"
+          - "mr"
+          - "no"
+          - "pt"
+          - "ro"
+          - "ru"
+          - "es"
+          - "sv"
+          - "ta"
+          - "te"
+          - "uk"
+        country:
+          type: "string"
+          title: "Country"
+          description: "This parameter allows you to specify the country where the\
+            \ news articles returned by the API were published, the contents of the\
+            \ articles are not necessarily related to the specified country. You have\
+            \ to set as value the 2 letters code of the country you want to filter."
+          order: 3
+          enum:
+          - "au"
+          - "br"
+          - "ca"
+          - "cn"
+          - "eg"
+          - "fr"
+          - "de"
+          - "gr"
+          - "hk"
+          - "in"
+          - "ie"
+          - "il"
+          - "it"
+          - "jp"
+          - "nl"
+          - "no"
+          - "pk"
+          - "pe"
+          - "ph"
+          - "pt"
+          - "ro"
+          - "ru"
+          - "sg"
+          - "es"
+          - "se"
+          - "ch"
+          - "tw"
+          - "ua"
+          - "gb"
+          - "us"
+        in:
+          type: "array"
+          title: "In"
+          description: "This parameter allows you to choose in which attributes the\
+            \ keywords are searched. The attributes that can be set are title, description\
+            \ and content. It is possible to combine several attributes."
+          order: 4
+          items:
+            type: "string"
+            enum:
+            - "title"
+            - "description"
+            - "content"
+        nullable:
+          type: "array"
+          title: "Nullable"
+          description: "This parameter allows you to specify the attributes that you\
+            \ allow to return null values. The attributes that can be set are title,\
+            \ description and content. It is possible to combine several attributes"
+          order: 5
+          items:
+            type: "string"
+            enum:
+            - "title"
+            - "description"
+            - "content"
+        start_date:
+          type: "string"
+          title: "Start Date"
+          description: "This parameter allows you to filter the articles that have\
+            \ a publication date greater than or equal to the specified value. The\
+            \ date must respect the following format: YYYY-MM-DD hh:mm:ss (in UTC)"
+          order: 6
+          pattern: "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$"
+          examples:
+          - "2022-08-21 16:27:09"
+        end_date:
+          type: "string"
+          title: "End Date"
+          description: "This parameter allows you to filter the articles that have\
+            \ a publication date smaller than or equal to the specified value. The\
+            \ date must respect the following format: YYYY-MM-DD hh:mm:ss (in UTC)"
+          order: 6
+          pattern: "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$"
+          examples:
+          - "2022-08-21 16:27:09"
+        sortby:
+          type: "string"
+          title: "Sort By"
+          description: "This parameter allows you to choose with which type of sorting\
+            \ the articles should be returned. Two values are possible:\n - publishedAt\
+            \ = sort by publication date, the articles with the most recent publication\
+            \ date are returned first\n - relevance = sort by best match to keywords,\
+            \ the articles with the best match are returned first"
+          order: 7
+          enum:
+          - "publishedAt"
+          - "relevance"
+        top_headlines_query:
+          type: "string"
+          order: 8
+          title: "Top Headlines Query"
+          description: "This parameter allows you to specify your search keywords\
+            \ to find the news articles you are looking for. The keywords will be\
+            \ used to return the most relevant articles. It is possible to use logical\
+            \ operators with keywords. - Phrase Search Operator: This operator allows\
+            \ you to make an exact search. Keywords surrounded by \n quotation marks\
+            \ are used to search for articles with the exact same keyword sequence.\
+            \ \n For example the query: \"Apple iPhone\" will return articles matching\
+            \ at least once this sequence of keywords.\n- Logical AND Operator: This\
+            \ operator allows you to make sure that several keywords are all used\
+            \ in the article\n search. By default the space character acts as an\
+            \ AND operator, it is possible to replace the space character \n by AND\
+            \ to obtain the same result. For example the query: Apple Microsoft is\
+            \ equivalent to Apple AND Microsoft\n- Logical OR Operator: This operator\
+            \ allows you to retrieve articles matching the keyword a or the keyword\
+            \ b.\n It is important to note that this operator has a higher precedence\
+            \ than the AND operator. For example the \n query: Apple OR Microsoft\
+            \ will return all articles matching the keyword Apple as well as all articles\
+            \ matching \n the keyword Microsoft\n- Logical NOT Operator: This operator\
+            \ allows you to remove from the results the articles corresponding to\
+            \ the\n specified keywords. To use it, you need to add NOT in front of\
+            \ each word or phrase surrounded by quotes.\n For example the query:\
+            \ Apple NOT iPhone will return all articles matching the keyword Apple\
+            \ but not the keyword\n iPhone"
+          examples:
+          - "Microsoft Windows 10"
+          - "Apple OR Microsoft"
+          - "Apple AND NOT iPhone"
+          - "(Windows 7) AND (Windows 10)"
+          - "Intel AND (i7 OR i9)"
+        top_headlines_topic:
+          type: "string"
+          title: "Top Headlines Topic"
+          description: "This parameter allows you to change the category for the request."
+          order: 9
+          enum:
+          - "breaking-news"
+          - "world"
+          - "nation"
+          - "business"
+          - "technology"
+          - "entertainment"
+          - "sports"
+          - "science"
+          - "health"
+  supportsNormalization: false
+  supportsDBT: false
+  supported_destination_sync_modes: []
 - dockerImage: "airbyte/source-gocardless:0.1.0"
   spec:
     documentationUrl: "https://docs.airbyte.com/integrations/sources/gocardless"
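
Review note: the spec above marks `api_key` and `query` as required, constrains `start_date` and `end_date` with a regex, and limits `sortby` to two values. The following is a minimal Python sketch of how a config could be checked against those three rules before running the connector. It is not part of this commit and is not how Airbyte itself validates configs; the function name `find_config_problems` and the message wording are hypothetical.

```python
import re
from typing import Any, Dict, List

# Regex copied from the `pattern` of start_date / end_date in the spec.
DATE_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$")
# Values copied from the spec's `required` list and the `sortby` enum.
REQUIRED_FIELDS = ("api_key", "query")
SORTBY_VALUES = ("publishedAt", "relevance")


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found.

    Hypothetical helper: it covers only the three rules named above,
    not the full JSON schema.
    """
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        if field not in config:
            problems.append(f"missing required field: {field}")
    for field in ("start_date", "end_date"):
        value = config.get(field)
        if value is not None and not DATE_PATTERN.search(str(value)):
            problems.append(f"{field} does not match YYYY-MM-DD hh:mm:ss")
    sortby = config.get("sortby")
    if sortby is not None and sortby not in SORTBY_VALUES:
        problems.append(f"sortby must be one of {', '.join(SORTBY_VALUES)}")
    return problems
```

For complete validation, a JSON Schema validator run against the connector's own spec would be the more faithful check; this sketch only makes the three constraints easy to see.
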
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+*
+!Dockerfile
+!main.py
+!source_gnews
+!setup.py
+!secrets
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+FROM python:3.9.11-alpine3.15 as base
+
+# build and load all requirements
+FROM base as builder
+WORKDIR /airbyte/integration_code
+
+# upgrade pip to the latest version
+RUN apk --no-cache upgrade \
+    && pip install --upgrade pip \
+    && apk --no-cache add tzdata build-base
+
+
+COPY setup.py ./
+# install necessary packages to a temporary folder
+RUN pip install --prefix=/install .
+
+# build a clean environment
+FROM base
+WORKDIR /airbyte/integration_code
+
+# copy all loaded and built libraries to a pure basic image
+COPY --from=builder /install /usr/local
+# add default timezone settings
+COPY --from=builder /usr/share/zoneinfo/Etc/UTC /etc/localtime
+RUN echo "Etc/UTC" > /etc/timezone
+
+# bash is installed for more convenient debugging.
+RUN apk --no-cache add bash
+
+# copy payload code only
+COPY main.py ./
+COPY source_gnews ./source_gnews
+
+ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
+ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
+
+LABEL io.airbyte.version=0.1.0
+LABEL io.airbyte.name=airbyte/source-gnews
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+# Gnews Source
+
+This is the repository for the Gnews configuration based source connector.
+For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.io/integrations/sources/gnews).
+
+## Local development
+
+#### Building via Gradle
+You can also build the connector in Gradle. This is typically used in CI and not needed for your development workflow.
+
+To build using Gradle, from the Airbyte repository root, run:
+```
+./gradlew :airbyte-integrations:connectors:source-gnews:build
+```
+
+#### Create credentials
+**If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/gnews)
+to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_gnews/spec.yaml` file.
+Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
+See `integration_tests/sample_config.json` for a sample config file.
+
+**If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source gnews test creds`
+and place them into `secrets/config.json`.
+
+### Locally running the connector docker image
+
+#### Build
+First, make sure you build the latest Docker image:
+```
+docker build . -t airbyte/source-gnews:dev
+```
+
+You can also build the connector image via Gradle:
+```
+./gradlew :airbyte-integrations:connectors:source-gnews:airbyteDocker
+```
+When building via Gradle, the docker image name and tag, respectively, are the values of the `io.airbyte.name` and `io.airbyte.version` `LABEL`s in
+the Dockerfile.
+
+#### Run
+Then run any of the connector commands as follows:
+```
+docker run --rm airbyte/source-gnews:dev spec
+docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-gnews:dev check --config /secrets/config.json
+docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-gnews:dev discover --config /secrets/config.json
+docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-gnews:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
+```
+## Testing
+
+#### Acceptance Tests
+Customize `acceptance-test-config.yml` file to configure tests. See [Source Acceptance Tests](https://docs.airbyte.io/connector-development/testing-connectors/source-acceptance-tests-reference) for more information.
+If your connector requires to create or destroy resources for use during acceptance tests create fixtures for it and place them inside integration_tests/acceptance.py.
+
+To run your integration tests with docker
+
+### Using gradle to run tests
+All commands should be run from airbyte project root.
+To run unit tests:
+```
+./gradlew :airbyte-integrations:connectors:source-gnews:unitTest
+```
+To run acceptance and custom integration tests:
+```
+./gradlew :airbyte-integrations:connectors:source-gnews:integrationTest
+```
+
+## Dependency Management
+All of your dependencies should go in `setup.py`, NOT `requirements.txt`. The requirements file is only used to connect internal Airbyte dependencies in the monorepo for local development.
+We split dependencies between two groups, dependencies that are:
+* required for your connector to work need to go to `MAIN_REQUIREMENTS` list.
+* required for the testing need to go to `TEST_REQUIREMENTS` list
+
+### Publishing a new version of the connector
+You've checked out the repo, implemented a million dollar feature, and you're ready to share your changes with the world. Now what?
+1. Make sure your changes are passing unit and integration tests.
+1. Bump the connector version in `Dockerfile` -- just increment the value of the `LABEL io.airbyte.version` appropriately (we use [SemVer](https://semver.org/)).
+1. Create a Pull Request.
+1. Pat yourself on the back for being an awesome contributor.
+1. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
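
Review note: the README's "Create credentials" step asks for a `secrets/config.json` that conforms to the spec. A small Python sketch of producing such a file might look like the following. The helper names `build_config` and `write_config` and the placeholder values are hypothetical, and only the two required fields plus an optional `start_date` are covered.

```python
import json
from pathlib import Path
from typing import Optional


def build_config(api_key: str, query: str, start_date: Optional[str] = None) -> dict:
    """Assemble a config holding the spec's two required fields.

    Hypothetical helper; `start_date` is included only when given and is
    expected in the spec's `YYYY-MM-DD hh:mm:ss` (UTC) format.
    """
    config = {"api_key": api_key, "query": query}
    if start_date is not None:
        config["start_date"] = start_date
    return config


def write_config(config: dict, path: Path = Path("secrets/config.json")) -> Path:
    """Write the config as JSON, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return path


if __name__ == "__main__":
    # Placeholder values: substitute a real GNews API key before running.
    write_config(build_config("YOUR_API_KEY", "Intel AND (i7 OR i9)"))
```

The resulting file can then be mounted into the container with the `docker run ... check --config /secrets/config.json` command shown in the README.
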
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+#
+# Copyright (c) 2022 Airbyte, Inc., all rights reserved.
+#
