Commit df67b36

🐛Source Rss: Fix Incremental Sync (#37535)
1 parent 9eae446 commit df67b36

10 files changed: +68 -20 lines changed

airbyte-integrations/connectors/source-rss/README.md

+3 -3
@@ -25,7 +25,7 @@ poetry install --with dev
 **If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.com/integrations/sources/rss)
 to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `src/source_rss/spec.yaml` file.
 Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
-See `sample_files/sample_config.json` for a sample config file.
+See `integration_tests/sample_config.json` for a sample config file.


 ### Locally running the connector
@@ -34,7 +34,7 @@ See `sample_files/sample_config.json` for a sample config file.
 poetry run source-rss spec
 poetry run source-rss check --config secrets/config.json
 poetry run source-rss discover --config secrets/config.json
-poetry run source-rss read --config secrets/config.json --catalog sample_files/configured_catalog.json
+poetry run source-rss read --config secrets/config.json --catalog integration_tests/configured_catalog.json
 ```

 ### Running tests
@@ -100,4 +100,4 @@ You've checked out the repo, implemented a million dollar feature, and you're re
 5. Create a Pull Request: use [our PR naming conventions](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#pull-request-title-convention).
 6. Pat yourself on the back for being an awesome contributor.
 7. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
-8. Once your PR is merged, the new version of the connector will be automatically published to Docker Hub and our connector registry.
+8. Once your PR is merged, the new version of the connector will be automatically published to Docker Hub and our connector registry.

airbyte-integrations/connectors/source-rss/acceptance-test-config.yml

+5 -6
@@ -20,12 +20,11 @@ acceptance_tests:
         configured_catalog_path: "integration_tests/configured_catalog.json"
         empty_streams: []
   incremental:
-    bypass_reason: "This connector does not implement incremental sync"
-    # tests:
-    # - config_path: "secrets/config.json"
-    # configured_catalog_path: "integration_tests/configured_catalog.json"
-    # future_state:
-    # future_state_path: "integration_tests/abnormal_state.json"
+    tests:
+      - config_path: "secrets/config.json"
+        configured_catalog_path: "integration_tests/configured_catalog.json"
+        future_state:
+          future_state_path: "integration_tests/abnormal_state.json"
   full_refresh:
     tests:
       - config_path: "secrets/config.json"
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
 {
-  "todo-stream-name": {
-    "todo-field-name": "value"
+  "items": {
+    "published": "3333-10-24T16:16:00+00:00"
   }
 }
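
The `3333` cursor value in the hunk above lies far in the future, which suggests it is the state file that the acceptance-test config references through `future_state_path`. Below is a minimal Python sketch of why such a state should yield no records, assuming a record is kept only when its `published` value is at or after the saved cursor. The helper `records_at_or_after_state` and the sample `feed` are hypothetical and are not part of the connector.

```python
import json
from datetime import datetime

# Same format string the manifest uses for the cursor field.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

# State shaped like the hunk above: {stream name: {cursor field: value}}.
ABNORMAL_STATE = json.loads('{"items": {"published": "3333-10-24T16:16:00+00:00"}}')


def records_at_or_after_state(records, state, stream="items", cursor_field="published"):
    """Hypothetical helper: keep records whose cursor value is >= the saved state."""
    cursor = datetime.strptime(state[stream][cursor_field], DATETIME_FORMAT)
    return [
        record
        for record in records
        if datetime.strptime(record[cursor_field], DATETIME_FORMAT) >= cursor
    ]


# Made-up feed items, used only for illustration.
feed = [
    {"title": "a", "published": "2024-04-29T10:00:00+00:00"},
    {"title": "b", "published": "2024-04-30T09:30:00+00:00"},
]

emitted = records_at_or_after_state(feed, ABNORMAL_STATE)
```

With a cursor in the year 3333, `emitted` is an empty list, since no real feed item can be newer than the saved state.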

airbyte-integrations/connectors/source-rss/integration_tests/configured_catalog.json

+36 -2
@@ -3,8 +3,42 @@
     {
       "stream": {
         "name": "items",
-        "json_schema": {},
-        "supported_sync_modes": ["full_refresh"]
+        "json_schema": {
+          "$schema": "http://json-schema.org/draft-07/schema#",
+          "type": "object",
+          "required": ["published"],
+          "properties": {
+            "title": {
+              "type": ["null", "string"]
+            },
+            "link": {
+              "type": ["null", "string"]
+            },
+            "description": {
+              "type": ["null", "string"]
+            },
+            "author": {
+              "type": ["null", "string"]
+            },
+            "category": {
+              "type": ["null", "string"]
+            },
+            "comments": {
+              "type": ["null", "string"]
+            },
+            "enclosure": {
+              "type": ["null", "string"]
+            },
+            "guid": {
+              "type": ["null", "string"]
+            },
+            "published": {
+              "type": ["string"],
+              "format": "date-time"
+            }
+          }
+        },
+        "supported_sync_modes": ["full_refresh", "incremental"]
       },
       "sync_mode": "full_refresh",
       "destination_sync_mode": "overwrite"
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
 {
-  "todo-stream-name": {
-    "todo-field-name": "value"
+  "items": {
+    "published": "2022-10-24T16:16:00+00:00"
   }
 }

airbyte-integrations/connectors/source-rss/metadata.yaml

+1 -1
@@ -24,7 +24,7 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: 0efee448-6948-49e2-b786-17db50647908
-  dockerImageTag: 1.0.0
+  dockerImageTag: 1.0.1
   dockerRepository: airbyte/source-rss
   githubIssueLabel: source-rss
   icon: rss.svg

airbyte-integrations/connectors/source-rss/pyproject.toml

+1 -1
@@ -3,7 +3,7 @@ requires = [ "poetry-core>=1.0.0",]
 build-backend = "poetry.core.masonry.api"

 [tool.poetry]
-version = "1.0.0"
+version = "1.0.1"
 name = "source-rss"
 description = "Source implementation for rss."
 authors = [ "Airbyte <[email protected]>",]

airbyte-integrations/connectors/source-rss/source_rss/manifest.yaml

+15 -1
@@ -1,4 +1,4 @@
-version: "0.44.0"
+version: "0.78.5"

 definitions:
   selector:
@@ -17,6 +17,8 @@ definitions:
     type: SimpleRetriever
     record_selector:
       $ref: "#/definitions/selector"
+      record_filter:
+        condition: "{{ record['published'] >= stream_interval['start_time'] }}"
     paginator:
       type: NoPagination
     requester:
@@ -36,6 +38,18 @@ definitions:
         $ref: "#/definitions/items_schema"
     $parameters:
       path: "/"
+    incremental_sync:
+      type: DatetimeBasedCursor
+      cursor_field: published
+      datetime_format: "%Y-%m-%dT%H:%M:%S%z"
+      start_datetime:
+        type: MinMaxDatetime
+        datetime: "{{ (now_utc() - duration('PT23H')).strftime('%Y-%m-%dT%H:%M:%S%z') }}"
+        datetime_format: "%Y-%m-%dT%H:%M:%S%z"
+      end_datetime:
+        type: MinMaxDatetime
+        datetime: "{{ now_utc().strftime('%Y-%m-%dT%H:%M:%S%z') }}"
+        datetime_format: "%Y-%m-%dT%H:%M:%S%z"

   items_schema:
     $schema: http://json-schema.org/draft-07/schema#
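
The `incremental_sync` block and the `record_filter` condition above can be illustrated in plain Python. This is a sketch under stated assumptions, not the CDK's implementation: `default_window` and `keep_record` are hypothetical helpers, and the sketch parses timestamps into `datetime` objects before comparing them, whereas the diff does not show how the CDK evaluates the condition.

```python
from datetime import datetime, timedelta, timezone

# Same format string as `datetime_format` in the manifest.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def default_window(now):
    """Hypothetical helper: (start, end) strings covering the 23 hours before `now`.

    Mirrors `now_utc() - duration('PT23H')` for the start and `now_utc()` for the end.
    """
    start = now - timedelta(hours=23)
    return start.strftime(DATETIME_FORMAT), now.strftime(DATETIME_FORMAT)


def keep_record(record, start_time):
    """Hypothetical helper mirroring the filter: published >= interval start."""
    published = datetime.strptime(record["published"], DATETIME_FORMAT)
    return published >= datetime.strptime(start_time, DATETIME_FORMAT)


# A fixed "now" keeps the example deterministic.
now = datetime(2024, 4, 30, 12, 0, 0, tzinfo=timezone.utc)
start, end = default_window(now)

recent = {"published": "2024-04-30T08:00:00+00:00"}
stale = {"published": "2024-04-28T08:00:00+00:00"}
```

Here `keep_record(recent, start)` is true and `keep_record(stale, start)` is false. Note that `strftime` with `%z` writes the offset as `+0000`, while the state files in this commit use `+00:00`; `strptime` accepts both forms on Python 3.7 and later, which is one reason the sketch compares parsed values rather than raw strings.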

docs/integrations/sources/rss-migrations.md

+2 -2
@@ -1,4 +1,4 @@
-# RSS Migration Guide
+# Rss Migration Guide

 ## Upgrading to 1.0.0
 We're continuously striving to enhance the quality and reliability of our connectors at Airbyte.
@@ -18,4 +18,4 @@ Clearing your data is required for the affected streams in order to continue syn
 1. Ensure the **Clear affected streams** option is checked to ensure your streams continue syncing successfully with the new schema.
 4. Select **Save connection**.

-This will clear the data in your destination for the subset of streams with schema changes. After the clear succeeds, trigger a sync by clicking **Sync Now**. For more information on clearing your data in Airbyte, see [this page](https://docs.airbyte.com/operator-guides/reset).
+This will clear the data in your destination for the subset of streams with schema changes. After the clear succeeds, trigger a sync by clicking **Sync Now**. For more information on clearing your data in Airbyte, see [this page](https://docs.airbyte.com/operator-guides/reset).

docs/integrations/sources/rss.md

+1
@@ -34,5 +34,6 @@ None

 | Version | Date | Pull Request | Subject |
 | :------ | :---------- | :------------------------------------------------------- | :----------------------------- |
+| 1.0.1 | 2024-04-30 | [37535](https://github.com/airbytehq/airbyte/pull/37535) | Fix incremental sync |
 | 1.0.0 | 2024-04-20 | [36418](https://github.com/airbytehq/airbyte/pull/36418) | Migrate python cdk to low code |
 | 0.1.0 | 2022-10-12 | [18838](https://github.com/airbytehq/airbyte/pull/18838) | Initial release supporting RSS |
