Skip to content

🎉 New Source: RSS [Python CDK] #18838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Nov 14, 2022

Conversation

jrhizor
Copy link
Contributor

@jrhizor jrhizor commented Nov 2, 2022

resolves airbytehq/connector-contest#218 (+$500)

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Tests

Unit
❯ python -m pytest unit_tests
Test session starts (platform: darwin, Python 3.10.6, pytest 6.2.5, pytest-sugar 0.9.5)
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.hypothesis/examples')
rootdir: /Users/jrhzor/code/airbyte, configfile: pytest.ini
plugins: hypothesis-6.54.6, sugar-0.9.5, requests-mock-1.9.3, mock-3.6.1, timeout-1.4.2, cov-3.0.0
collecting ...
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_cursor_field ✓                                                                       6% ▋
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_get_updated_state ✓                                                                 12% █▎
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_stream_slices ✓                                                                     18% █▊
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_supports_incremental ✓                                                              24% ██▍
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_source_defined_cursor ✓                                                             29% ██▉
 airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py::test_stream_checkpoint_interval ✓                                                        35% ███▌
 airbyte-integrations/connectors/source-rss/unit_tests/test_source.py::test_streams ✓                                                                                        41% ████▎
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_request_params ✓                                                                                47% ████▊
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_next_page_token ✓                                                                               53% █████▍
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_parse_response ✓                                                                                59% █████▉
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_request_headers ✓                                                                               65% ██████▌
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_http_method ✓                                                                                   71% ███████▏
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_should_retry[HTTPStatus.OK-False] ✓                                                             76% ███████▋
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_should_retry[HTTPStatus.BAD_REQUEST-False] ✓                                                    82% ████████▎
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_should_retry[HTTPStatus.TOO_MANY_REQUESTS-True] ✓                                               88% ████████▉
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_should_retry[HTTPStatus.INTERNAL_SERVER_ERROR-True] ✓                                           94% █████████▌
 airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py::test_backoff_time ✓                                                                                 100% ██████████
==================================================================================== warnings summary =====================================================================================
airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py: 6 warnings
airbyte-integrations/connectors/source-rss/unit_tests/test_source.py: 1 warning
airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py: 10 warnings
  /Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.venv/lib/python3.10/site-packages/airbyte_cdk/sources/streams/http/http.py:43: DeprecationWarning: Call to deprecated class NoAuth. (Set `authenticator=None` instead) -- Deprecated since version 0.1.20.
    self._authenticator: HttpAuthenticator = NoAuth()

airbyte-integrations/connectors/source-rss/unit_tests/test_incremental_streams.py: 6 warnings
airbyte-integrations/connectors/source-rss/unit_tests/test_source.py: 1 warning
airbyte-integrations/connectors/source-rss/unit_tests/test_streams.py: 10 warnings
  /Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.venv/lib/python3.10/site-packages/deprecated/classic.py:173: DeprecationWarning: Call to deprecated class HttpAuthenticator. (Use requests.auth.AuthBase instead) -- Deprecated since version 0.1.20.
    return old_new1(cls, *args, **kwargs)

-- Docs: https://docs.pytest.org/en/stable/warnings.html

Results (0.36s):
      17 passed
Integration
❯ python -m pytest integration_tests
Test session starts (platform: darwin, Python 3.10.6, pytest 6.2.5, pytest-sugar 0.9.5)
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.hypothesis/examples')
rootdir: /Users/jrhzor/code/airbyte, configfile: pytest.ini
plugins: hypothesis-6.54.6, sugar-0.9.5, requests-mock-1.9.3, mock-3.6.1, timeout-1.4.2, cov-3.0.0
collecting ...

Results (0.18s):
Acceptance
❯ python -m pytest integration_tests -p integration_tests.acceptance
Test session starts (platform: darwin, Python 3.10.6, pytest 6.2.5, pytest-sugar 0.9.5)
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.hypothesis/examples')
rootdir: /Users/jrhzor/code/airbyte, configfile: pytest.ini
plugins: hypothesis-6.54.6, sugar-0.9.5, requests-mock-1.9.3, mock-3.6.1, timeout-1.4.2, cov-3.0.0
collecting ...
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_config_match_spec[inputs0] ✓                                      3% ▍
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_match_expected[inputs0] ✓                                         7% ▊
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_docker_env[inputs0] ✓                                            10% █
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_oneof_usage[inputs0] ✓                                           14% █▍
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_required[inputs0] ✓                                              17% █▊
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_optional[inputs0] ✓                                              21% ██▏
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_has_secret[inputs0] ✓                                            24% ██▌
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_secret_never_in_the_output[inputs0] ✓                            28% ██▊
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_defined_refs_exist_in_json_spec_file[inputs0] ✓                  31% ███▏
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_oauth_flow_parameters[inputs0] ✓                                 34% ███▌      Pulling docker image airbyte/source-rss:latest
{"type": "LOG", "log": {"level": "WARN", "message": "\n We did not find the airbyte/source-rss:latest image for this connector. This probably means this version has not yet been published to an accessible docker registry like DockerHub."}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_backward_compatibility[inputs0] s                                38% ███▊
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_additional_properties_is_true[inputs0] ✓                         41% ████▎
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs0] ✓                                           45% ████▌
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs1] ✓                                           48% ████▉
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_discover[inputs0] ✓                                         52% █████▎
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_defined_cursors_exist_in_schema[inputs0] ✓                  55% █████▌
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_defined_refs_exist_in_schema[inputs0] ✓                     59% █████▉
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_defined_keyword_exist_in_schema[inputs0-allOf] ✓            62% ██████▎
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_defined_keyword_exist_in_schema[inputs0-not] ✓              66% ██████▋
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_primary_keys_exist_in_schema[inputs0] ✓                     69% ██████▉
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_streams_has_sync_modes[inputs0] ✓                           72% ███████▎
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_additional_properties_is_true[inputs0] ✓                    76% ███████▋  Pulling docker image airbyte/source-rss:latest
{"type": "LOG", "log": {"level": "WARN", "message": "\n We did not find the airbyte/source-rss:latest image for this connector. This probably means this version has not yet been published to an accessible docker registry like DockerHub."}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_backward_compatibility[inputs0] s                           79% ███████▉
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead.test_read[inputs0] ✓                                             83% ████████▍ {"type": "LOG", "log": {"level": "ERROR", "message": "Docker container failed, code 1, error:\n{\"type\": \"TRACE\", \"trace\": {\"type\": \"ERROR\", \"emitted_at\": 1667371201165.666, \"error\": {\"message\": \"Something went wrong in the connector. See the logs for more details.\", \"internal_message\": \"2 validation errors for ConfiguredAirbyteCatalog\\nstreams -> 0 -> sync_mode\\n  value is not a valid enumeration member; permitted: 'full_refresh', 'incremental' (type=type_error.enum; enum_values=[<SyncMode.full_refresh: 'full_refresh'>, <SyncMode.incremental: 'incremental'>])\\nstreams -> 0 -> destination_sync_mode\\n  value is not a valid enumeration member; permitted: 'append', 'overwrite', 'append_dedup' (type=type_error.enum; enum_values=[<DestinationSyncMode.append: 'append'>, <DestinationSyncMode.overwrite: 'overwrite'>, <DestinationSyncMode.append_dedup: 'append_dedup'>])\", \"stack_trace\": \"Traceback (most recent call last):\\n  File \\\"/airbyte/integration_code/main.py\\\", line 13, in <module>\\n    launch(source, sys.argv[1:])\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\\\", line 131, in launch\\n    for message in source_entrypoint.run(parsed_args):\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\\\", line 119, in run\\n    config_catalog = self.source.read_catalog(parsed_args.catalog)\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/source.py\\\", line 90, in read_catalog\\n    return ConfiguredAirbyteCatalog.parse_obj(self.read_config(catalog_path))\\n  File \\\"/usr/local/lib/python3.9/site-packages/pydantic/main.py\\\", line 521, in parse_obj\\n    return cls(**obj)\\n  File \\\"/usr/local/lib/python3.9/site-packages/pydantic/main.py\\\", line 341, in __init__\\n    raise validation_error\\npydantic.error_wrappers.ValidationError: 2 validation errors for ConfiguredAirbyteCatalog\\nstreams -> 0 -> sync_mode\\n  value is not a valid enumeration member; permitted: 'full_refresh', 'incremental' (type=type_error.enum; enum_values=[<SyncMode.full_refresh: 'full_refresh'>, <SyncMode.incremental: 'incremental'>])\\nstreams -> 0 -> destination_sync_mode\\n  value is not a valid enumeration member; permitted: 'append', 'overwrite', 'append_dedup' (type=type_error.enum; enum_values=[<DestinationSyncMode.append: 'append'>, <DestinationSyncMode.overwrite: 'overwrite'>, <DestinationSyncMode.append_dedup: 'append_dedup'>])\\n\", \"failure_type\": \"system_error\"}}}\n"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead.test_airbyte_trace_message_on_failure[inputs0] ✓                 86% ████████▋
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_full_refresh.py::TestFullRefresh.test_sequential_reads[inputs0] ✓                       90% █████████
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py::TestIncremental.test_two_sequential_reads[inputs0] ✓                    93% █████████▍
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py::TestIncremental.test_read_sequential_slices[inputs0] ✓                  97% █████████▋
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py::TestIncremental.test_state_with_abnormally_large_values[inputs0] ✓     100% ██████████
{"type": "LOG", "log": {"level": "INFO", "message": "/Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss - SAT run - 75c7503c96fa9789a5fd98b7b023108ac1a2fd8b - PASSED"}}

==================================================================================== warnings summary =====================================================================================
airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py: 22 warnings
airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_full_refresh.py: 1 warning
airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py: 3 warnings
  /Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.venv/lib/python3.10/site-packages/docker/utils/utils.py:52: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    s1 = StrictVersion(v1)

airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py: 22 warnings
airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_full_refresh.py: 1 warning
airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py: 3 warnings
  /Users/jrhzor/code/airbyte/airbyte-integrations/connectors/source-rss/.venv/lib/python3.10/site-packages/docker/utils/utils.py:53: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    s2 = StrictVersion(v2)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================================================================= short test summary info =================================================================================
SKIPPED [1] ../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] ../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.

Results (27.50s):
      27 passed
       2 skipped

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Nov 2, 2022
@jrhizor
Copy link
Contributor Author

jrhizor commented Nov 2, 2022

Misc feedback:


Broken link in generated Python connector docs (doesn't have that many lines anymore and doesn't demonstrate what it's supposed to):

See https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/source_stripe/source.py#L232 for an example.

The generated cursor_field method says it should return a str but the generated example has an empty list []. This is also reflected in test_cursor_field which checks for list by default.


Is it intentional for the generated test to include a bypass reason when by default it includes the incremental test case? Maybe it should be commented out in the generated version so anyone generating an incremental test doesn't need to to try to figure out why a bypass check isn't working properly.


airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py
181:        logger.info(f"Syncing stream: {stream_name} ")

nit: extra space at the end of the log. didn't want to make a whole CDK change for this.


There isn't a TODO anywhere to set the future / abnormal state. In the generated abnormal state file it only has "abnormal", so I entered an invalid state file at first instead of what is intended (a "maximal" state for your stream). Maybe this should be directly in a unit test with a TODO comment that makes it clear what the content should be?


The default acceptance-test-config.yml should generate:

acceptance_tests:
  spec:
    tests:
      - config_path: "secrets/config.json"
        spec_path: "source_rss/spec.yaml"

instead of

acceptance_tests:
  spec:
    tests:
      - spec_path: "source_rss/spec.yaml"

so it's obvious if you're using a custom config path across the rest of the file that you need to update it for this one too. Possible to debug, but it takes 5 min instead of being obvious.


nit: parced vs parsed spelling


At git commit 75c7503c96 I got:

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead.test_read[inputs0] ✓                                             83% ████████▍ {"type": "LOG", "log": {"level": "ERROR", "message": "Docker container failed, code 1, error:\n{\"type\": \"TRACE\", \"trace\": {\"type\": \"ERROR\", \"emitted_at\": 1667370530278.373, \"error\": {\"message\": \"Something went wrong in the connector. See the logs for more details.\", \"internal_message\": \"2 validation errors for ConfiguredAirbyteCatalog\\nstreams -> 0 -> sync_mode\\n  value is not a valid enumeration member; permitted: 'full_refresh', 'incremental' (type=type_error.enum; enum_values=[<SyncMode.full_refresh: 'full_refresh'>, <SyncMode.incremental: 'incremental'>])\\nstreams -> 0 -> destination_sync_mode\\n  value is not a valid enumeration member; permitted: 'append', 'overwrite', 'append_dedup' (type=type_error.enum; enum_values=[<DestinationSyncMode.append: 'append'>, <DestinationSyncMode.overwrite: 'overwrite'>, <DestinationSyncMode.append_dedup: 'append_dedup'>])\", \"stack_trace\": \"Traceback (most recent call last):\\n  File \\\"/airbyte/integration_code/main.py\\\", line 13, in <module>\\n    launch(source, sys.argv[1:])\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\\\", line 131, in launch\\n    for message in source_entrypoint.run(parsed_args):\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\\\", line 119, in run\\n    config_catalog = self.source.read_catalog(parsed_args.catalog)\\n  File \\\"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/source.py\\\", line 90, in read_catalog\\n    return ConfiguredAirbyteCatalog.parse_obj(self.read_config(catalog_path))\\n  File \\\"/usr/local/lib/python3.9/site-packages/pydantic/main.py\\\", line 521, in parse_obj\\n    return cls(**obj)\\n  File \\\"/usr/local/lib/python3.9/site-packages/pydantic/main.py\\\", line 341, in __init__\\n    raise validation_error\\npydantic.error_wrappers.ValidationError: 2 validation errors for ConfiguredAirbyteCatalog\\nstreams -> 0 -> sync_mode\\n  value is not a valid enumeration member; permitted: 'full_refresh', 'incremental' (type=type_error.enum; enum_values=[<SyncMode.full_refresh: 'full_refresh'>, <SyncMode.incremental: 'incremental'>])\\nstreams -> 0 -> destination_sync_mode\\n  value is not a valid enumeration member; permitted: 'append', 'overwrite', 'append_dedup' (type=type_error.enum; enum_values=[<DestinationSyncMode.append: 'append'>, <DestinationSyncMode.overwrite: 'overwrite'>, <DestinationSyncMode.append_dedup: 'append_dedup'>])\\n\", \"failure_type\": \"system_error\"}}}\n"}}

This is because of the invalid config intentionally being generated as part of the test, but it's confusing as a part of test_read and the fact that it looks like an error but it's expected. Very confusing when you're just reading through outputs.


There were a couple of unique things about this source that didn't fit super well into the abstraction. First of all, it has to read the full page every time, so any incremental reads are "fake" in that they filter out records emitted previously w.r.t. the state object. This also required dynamic urls based off the config, which felt a bit clunky to inject (maybe there's a better way than what I did).


Another thing I ran into here is that date handling was pretty scary and would have been pretty easy to mess up, since I didn't realize that Python just dropped timezone info by default if you don't get the datetime in a specific format and then specify tzinfos before outputting the data as an iso8601 string so it actually includes the timestamps.

I imagine this a common class of potential bugs for integrations? Maybe if the JSON Schema for stream includes a datetime there should be a unit test that at least fails initially if there isn't timezone information. Then the dev could opt out if they really don't want it, but it feels like in the vast majority of cases it'd be a mistake not to have timezone info.


Overall I felt like building this went fairly smoothly though. Nothing was a huge time sink this time around.

@tuliren
Copy link
Contributor

tuliren commented Nov 2, 2022

Oh mine, Jared is still on Airbyte payroll.

@marcosmarxm
Copy link
Member

Hello thanks @jrhizor someone will review the contribution next week!

@marcosmarxm
Copy link
Member

Hello! I'm going to be out of the office this Friday and won't be able to review your contribution again today, I return next Monday. So far, most contributions look solid and are almost done to be approved. As said in Chris' comment all contributions made before 2-November are eligible to receive the prize and have 2 weeks to merge the contributions. But I ensure next week we're going to have your contribution merged. If you have questions about the implementation you can send them in #hacktoberfest-2022 in Airbyte's Slack.

Sorry the inconvenience and see you again next week, thank you so much for your contribution!

Copy link
Contributor

@vincentkoc vincentkoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrhizor thanks for your PR, I'm one of the Airbyte maintainers and have reviewed your code. Only one small stream change and a comment to answer.

Your PR was initially failing integration tests so I made some additional changes to this PR to make things pass.

@vincentkoc
Copy link
Contributor

Build is passing, I will do the version bump and merge/approve once the additional TODO and the question is answered.

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.
======================== 27 passed, 2 skipped in 27.33s ========================

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/7.4/userguide/command_line_interface.html#sec:command_line_warnings

Execution optimizations have been disabled for 1 invalid unit(s) of work during this build to ensure correctness.
Please consult deprecation warnings for more details.

BUILD SUCCESSFUL in 3m 11s
48 actionable tasks: 23 executed, 25 up-to-date

@vincentkoc
Copy link
Contributor

@marcosmarxm @sajarin we need to add a config to github actions CI/CD in order to pass this connector. I used the sample provided.

{
  "url": "https://www.nasa.gov/rss/dyn/breaking_news.rss"
}

@vincentkoc vincentkoc self-assigned this Nov 6, 2022
@vincentkoc
Copy link
Contributor

vincentkoc commented Nov 9, 2022

/test connector=connectors/source-rss

🕑 connectors/source-rss https://github.com/airbytehq/airbyte/actions/runs/3428541787
✅ connectors/source-rss https://github.com/airbytehq/airbyte/actions/runs/3428541787
Python tests coverage:

Name                     Stmts   Miss  Cover
--------------------------------------------
source_rss/__init__.py       2      0   100%
source_rss/source.py        91     26    71%
--------------------------------------------
TOTAL                       93     26    72%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       133      3    98%   87, 93, 230
	 source_acceptance_test/conftest.py                     196     92    53%   35, 41-43, 48, 54, 60, 66, 72-74, 93, 98-100, 106-108, 114-115, 120-121, 126, 132, 141-150, 156-161, 176, 200, 231, 237, 243-248, 256-261, 269-282, 287-293, 300-311, 318-334
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              345    110    68%   53, 64-72, 77-84, 88-89, 93-94, 178, 216-233, 242-250, 254-259, 265, 298-303, 341-348, 391-393, 396, 461-469, 481-484, 489, 545-546, 552, 555, 591-601, 614-639
	 source_acceptance_test/tests/test_incremental.py       145     20    86%   21-23, 29-31, 36-43, 48-61, 224
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  94     10    89%   16-17, 32-38, 72, 75
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/config_migration.py        23     23     0%   5-37
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-68, 71-73, 76-78, 81-83, 86-88, 91-93, 96-114, 148-150
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1512    375    75%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:65: The previous connector image could not be retrieved.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:243: The previous connector image could not be retrieved.
======================== 27 passed, 2 skipped in 25.27s ========================

@vincentkoc
Copy link
Contributor

vincentkoc commented Nov 10, 2022

/publish connector=connectors/source-rss run-tests=false

🕑 Publishing the following connectors:
connectors/source-rss
https://github.com/airbytehq/airbyte/actions/runs/3440260231


Connector Did it publish? Were definitions generated?
connectors/source-rss

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@marcosmarxm marcosmarxm merged commit dffbeb3 into airbytehq:master Nov 14, 2022
@jrhizor jrhizor deleted the jrhizor/rss-source branch November 17, 2022 01:21
akashkulk pushed a commit that referenced this pull request Dec 2, 2022
* add source-rss

* update tests

* add docs

* fix doc link

* changes to pass tests

* update catalog

* Update test_streams.py

* fix time zone issue

* update source def

* auto-bump connector version

Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Octavia Squidington III <[email protected]>
Co-authored-by: Marcos Marx <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

RSS feed source
7 participants