-
Notifications
You must be signed in to change notification settings - Fork 4.6k
🎉 New Source: RSS [Python CDK] #18838
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Misc feedback: Broken link in generated Python connector docs (doesn't have that many lines anymore and doesn't demonstrate what it's supposed to):
The generated Is it intentional for the generated test to include a bypass reason when by default it includes the incremental test case? Maybe it should be commented out in the generated version so anyone generating an incremental test doesn't need to to try to figure out why a bypass check isn't working properly.
nit: extra space at the end of the log. didn't want to make a whole CDK change for this. There isn't a TODO anywhere to set the future / abnormal state. In the generated abnormal state file it only has "abnormal", so I entered an invalid state file at first instead of what is intended (a "maximal" state for your stream). Maybe this should be directly in a unit test with a TODO comment that makes it clear what the content should be? The default acceptance-test-config.yml should generate:
instead of
so it's obvious if you're using a custom config path across the rest of the file that you need to update it for this one too. Possible to debug, but it takes 5 min instead of being obvious. nit: At git commit
This is because of the invalid config intentionally being generated as part of the test, but it's confusing as a part of There were a couple of unique things about this source that didn't fit super well into the abstraction. First of all, it has to read the full page every time, so any incremental reads are "fake" in that they filter out records emitted previously w.r.t. the state object. This also required dynamic urls based off the config, which felt a bit clunky to inject (maybe there's a better way than what I did). Another thing I ran into here is that date handling was pretty scary and would have been pretty easy to mess up, since I didn't realize that Python just dropped timezone info by default if you don't get the datetime in a specific format and then specify I imagine this a common class of potential bugs for integrations? Maybe if the JSON Schema for stream includes a datetime there should be a unit test that at least fails initially if there isn't timezone information. Then the dev could opt out if they really don't want it, but it feels like in the vast majority of cases it'd be a mistake not to have timezone info. Overall I felt like building this went fairly smoothly though. Nothing was a huge time sink this time around. |
Oh mine, Jared is still on Airbyte payroll. |
Hello thanks @jrhizor someone will review the contribution next week! |
Hello! I'm going to be out of the office this Friday and won't be able to review your contribution again today, I return next Monday. So far, most contributions look solid and are almost done to be approved. As said in Chris' comment all contributions made before 2-November are eligible to receive the prize and have 2 weeks to merge the contributions. But I ensure next week we're going to have your contribution merged. If you have questions about the implementation you can send them in Sorry the inconvenience and see you again next week, thank you so much for your contribution! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jrhizor thanks for your PR, I'm one of the Airbyte maintainers and have reviewed your code. Only one small stream change and a comment to answer.
Your PR was initially failing integration tests so I made some additional changes to this PR to make things pass.
airbyte-integrations/connectors/source-rss/integration_tests/catalog.json
Outdated
Show resolved
Hide resolved
Build is passing, I will do the version bump and merge/approve once the additional TODO and the question is answered.
|
@marcosmarxm @sajarin we need to add a config to github actions CI/CD in order to pass this connector. I used the sample provided. {
"url": "https://www.nasa.gov/rss/dyn/breaking_news.rss"
} |
airbyte-integrations/connectors/source-rss/source_rss/source.py
Outdated
Show resolved
Hide resolved
/test connector=connectors/source-rss
Build PassedTest summary info:
|
/publish connector=connectors/source-rss run-tests=false
if you have connectors that successfully published but failed definition generation, follow step 4 here |
* add source-rss * update tests * add docs * fix doc link * changes to pass tests * update catalog * Update test_streams.py * fix time zone issue * update source def * auto-bump connector version Co-authored-by: Vincent Koc <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> Co-authored-by: Marcos Marx <[email protected]>
resolves airbytehq/connector-contest#218 (+$500)
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
docs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereTests
Unit
Integration
Acceptance