-
Notifications
You must be signed in to change notification settings - Fork 4.6k
🐛 Source S3: fix bug when importing parquet files with null datetimes #35650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Source S3: fix bug when importing parquet files with null datetimes #35650
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
@dtiesling thanks for the fix! Didn't realize we were working on this at the same time (I had a different issue as an entrypoint into investigating this). I have a stack of 2 PRs here that I think resolves the issue in the same way. Do you agree? If so, I'd continue there as it also addresses some typing problems and tests the None case for all the scalar types. |
@@ -1,5 +1,8 @@ | |||
# Changelog | |||
|
|||
## 0.67.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be best to remove this change from the PR: since the CDK publication workflow automatically updates airbyte-cdk/python/CHANGELOG.md
, it's best to rely on that rather than manually updating—that way there are no git conflicts between updates that are developed simultaneously, the changelog and version tags will be properly sorted in order of publication.
@dtiesling did you have time to take a look in Alex's comments? Also can you solve the conflicts? |
Closed due inactivity. @dtiesling let me know if you want to return the work here. |
What
Closes #34151
When importing a parquet file from S3 an exception is raised if a datetime column has a null value.
How
Refactors the parquet_parser
_to_output_value
method to test for None values and return early. This likely solves for the same bug in other column types.Recommended reading order
parquet_parser.py
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user?
No changes. Expected behavior should start occurring.
For connector PRs, use this section to explain which type of semantic versioning bump occurs as a result of the changes. Refer to our Semantic Versioning for Connectors guidelines for more information. Breaking changes to connectors must be documented by an Airbyte engineer (PR author, or reviewer for community PRs) by using the Breaking Change Release Playbook.
Should be a point release since this is a bug fix only.
If there are breaking changes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
No breaking changes.
Pre-merge Actions
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.0.0.1
Dockerfile
has version0.0.1
README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog with an entry for the initial version. See changelog exampledocs/integrations/README.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
Updating a connector
Community member or Airbyter
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
Connector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:generateScaffolds
then checking in your changesUpdating the Python CDK
Airbyter
Before merging:
--use-local-cdk --name=source-<connector>
as optionsairbyte-ci connectors --use-local-cdk --name=source-<connector> test
After merging: