Commit 0232182

shrodingers, alafanechere, edmundito, evantahler, and Tim Roes committed
Shrodingers/destination databricks dbt (#1)
* octavia-cli: fix workspace not having anonymous_data_collection property (#13869) * Update connection update calls to use central utility to ensure connection update has all data (#13564) * Update connection updates with build update utility * Add buildConnectionUpdate utility * Update components that update the connection to use utility when necessary * Use conection name when saving connection from replication view to prevent override from refreshed catalog * Improve connection check on ReplicationView onSubmit function * Display connection state in connection setting page (#13394) * Display Connection State in Setting page * memoize callback * rendering and confirmaton * setState API * Input validation * remove JSON step * rename apiMethod to `updateState` * test and adjust route * skip if sync is running * prevent state update when sync is running * code editor component * errors fixed * scss style * make linter happy * Back to monaco editor * Remove ability to edit state * Adjust FE code * Fix CSS problem * Update airbyte-webapp/src/locales/en.json Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> * just use PRE to render state for now Co-authored-by: Tim Roes <[email protected]> Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> * update api for per stream (#13835) * Update airbyte-protocol.md (#13892) * Update airbyte-protocol.md * Fix typo * Fix prose * Add protocol reviewers for protocol documentation * Remove duplicate * Edited Amplitude, Mailchimp, and Zendesk Support docs (#13897) * deleting SUMMARY.md since we don't need it for docusaurus builds (#13901) * Do not hide unexpected errors in the check connection (#13903) * Do not hide unexpected errors in the check connection * Fix test * Common code to deserialize a state message in the new format (#13772) * Common code to deserialize a state message in the new format * PR comments and type changed to typed * Format * Add StateType and StateWrapper objects to the model * Use state wrapper 
instead of Either * Switch to optional * PR comments * Support array legacy state * format Co-authored-by: Jimmy Ma <[email protected]> * 🐛 Source Amazon Seller Partner: handle start date for financial stream (#13633) * start and end date for finacial stream should not be more than 180 days apart * improve unit tests * make changes to start date for finance stream * update tests * lint changes * update version to 0.2.22 for source-amazon-seller-partner * Normalization: Fix incorrect jinja2 macro `json_extract_array` call (#13894) Signed-off-by: Sergey Chvalyuk <[email protected]> * Docs: fixed the broken links (#13915) * 0.2.5 -> 0.2.6 (#13924) Signed-off-by: Sergey Chvalyuk <[email protected]> * 13546 Fix integration tests source-postgres Mac OS (#13872) * 13546 Fix integration tests source-postgres Mac OS * 13548 Fixed integration tests source-tidb Mac OS (#13927) * Source MsSql : incr ver to include changes #13854 (#13887) * incr version * put PR id * docker ver * connectors that published (#13932) * Deprecate PART_SIZE_MB in connectors using S3/GCS storage (#13753) * Removed part_size from connectors that use StreamTransferManager * fixed S3DestinationConfigTest * fixed S3JsonlFormatConfigTest * upadate changelog and bump version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * upadate changelog and bump version for Redshift and Snowflake destinations * auto-bump connector version * fix GCS staging test * fix GCS staging test * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Reverted changes in SshBastionContainer (#13934) * 🎉 New Source Dockerhub (#13931) * init * implement working source + tests * add docs * add docs * fix bad comments * Update airbyte-integrations/connectors/source-dockerhub/acceptance-test-config.yml * Update airbyte-integrations/connectors/source-dockerhub/Dockerfile * Update 
airbyte-integrations/connectors/source-dockerhub/.dockerignore * Apply suggestions from code review * Update docs/integrations/sources/dockerhub.md * Update airbyte-integrations/connectors/source-dockerhub/integration_tests/acceptance.py Co-authored-by: George Claireaux <[email protected]> * address @Phlair's feedback * address @Phlair's feedback * each record is now a Docker image rather than response page * format * fix unit tests * fix acceptance tests * add icon, definition and generate seed spec * add requests to requirements Co-authored-by: sw-yx <[email protected]> * commented out non-relevant tests (#13940) * Bump Airbyte version from 0.39.20-alpha to 0.39.21-alpha (#13938) Co-authored-by: alafanechere <[email protected]> * newaction (#13942) * remove test action (#13944) * 🎉Source-mysql: aligned datatype test (#13945) * [13607] source-mysql: aligned datatype tests for regular and CDC ways + added CHAR fix to CDC processing * #13958 Source Stripe: fix configured catalogs (#13959) * 🐛 Source: Typeform - Update schema for Responses stream (#13935) * Upd responses schema * Upd docs * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * :window: Updated email invitation flow that enables invited users to set name and create password (#12788) * First pass accepting email link invitation * Update Auth service with signInWithEmailLink calls * Add AcceptEmailInvite component * Update FirebaseActionRoute to handle sign in mode * Rename ResetPasswordAction to FirebseActionRoute * Add create password setp to AcceptEmailInvite component * Remove continueURL from invite fetch * Update accept email invite for user to enter both email and password together * Set name during email link signup * Update AcceptEmailInvite to send name * Add updateName to UserService * Update AuthService to set name during sign up * Remove steps from AcceptEmailInvite component Remove setPassword from AuthService * Add header and title to accept invite page 
* Move invite error messages to en file * For invite link pages, show login link instead of sign up * Disable name update on sign in via email lnk * Resend email invite when the invite link is expired * Fix status message in accept email invite page * Re-enable set user's name during sign up email invite * Update signUpWithEmailLink so that sign up is successful even if we fail to update the user's name * Update comments on GoogleAuthService signInWithEmailLink * Add newsletter and accept terms checkboxes to accept email invite component * Extract signup form from signup page * Extract fields from signup form * Update accept email invite component to use field components from signup form * Ensure that sign up button is disable until form is valid and security checkbox is checked * Make error status text color in accept email link red * Update workspace check in DefaultView so that user lands in workspace selector when there are no workspaces * Add coment around continueUrl param usage in UserService * Remove usless default case in GoogleAuthService * Source Marketo: process fail during creation of an export job (#13930) * #9322 source Marketo: process fail during creation of an export job * #9322 source marketo: upd changelog * #9322 source marketo: fix unit test * #9322 source marketo: fix SATs * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * :window: :wrench: Add eslint rules for CSS modules (#13952) * add eslint-plugin-css-modules rules * Fixes: - turn on eslint css modules rule as error - remove unused styles * add warning message if styled components is used * Revert "add warning message if styled components is used" This reverts commit 4e92b8b2110142bb679f15aeb034e377e0dcc69c. 
* replace rule severity with words * Update salesforce.md Fixed broken link * :window: 🔧 Add auto-fixable linting rules to webapp (#13462) * Add new eslint rules that fit with our code style and downgrade rules to warn * allowExpressions in fragment eslint rule * Enable function-component-definition in eslint and fix styles * Cleanup lint file * Fix react/function-component-definition warnings manually * Add more auto-fixable rules and fix * Fix functions that require usless returns * Update array-type rule to array-simple * Fix eslint errors manually disable assignmentExpression for arrays in prefer-destructuring rule * Auto fix new linting issues after rebase * Enhance /publish to allow for multiple connectors and parallel execution (#13864) * start * revert * azblob * bq * bq denorm * megapublish baaaabyyyy * fix needs * matrix connectors * auto-bump connector version * dont failfast and max parallel 5 * multi runno * minor * testing matrix agents * name * testing multi agents * tmp fix * new multi agents * multi test * tryy * let's do this * magico * fix * label test * couple more connector bumps * temp * things * check this * lets gooo * more connectors * Delete TEMP-testing-command.yml * auto-bump connector version * added comment describing bash part * running single thread * catch sentry cli * auto-bump connector version * destinations * + snowflake * saved * auto-bump connector version * auto-bump connector version * java source bumps * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector 
version * auto-bump connector version * remove twice-defined methods * label things * revert action * using the new test action * point at action * wrong tag on action * update pool label * update to use new ec2-github-runner fork * this needs to be more generic than publisher * change publish to run on pool * add comment about runner-pool usage * updated publish command docs for multi & parallel connector runs * auto-bump connector version * auto-bump connector version * auto-bump connector version * unbump failed publish versions * missed dockerfiles * remove failed docs * mssql fix * overhauled the git comment output * bumping a test connector that should work * slight order switcheroo * output connectors properly in first message * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Bump Airbyte version from 0.39.21-alpha to 0.39.22-alpha (#13979) Co-authored-by: Phlair <[email protected]> * Parker/temporal cloud (#13243) * switch to temporal cloud client for now * format * use client cert/key env secret instead of path to secret * add TODO comments * format * add logging to debug timeout issue * add more logging * change workflow task timeout * PR feedback: consolidate as much as possible, add missing javadoc * fix acceptance test, needs to specify localhost * add internal-use only comments * format * refactor to clean up TemporalClient and prepare it for future dependency injection framework * remove extraneous log statements * PR feedback * fix test * return isInitialized true in test * 📄 Postgres source: fix CDC setup order in docs (#13949) * postgres source: fix CDC setup order docs * Update docs/integrations/sources/postgres.md Co-authored-by: Liren Tu <[email protected]> * Per-stream state support for Postgres source (#13609) * WIP Per-stream state support for Postgres source * Fix failing test * Improve code coverage * Make global the default state manager * Add legacy adapter state manager * Formatting * Include 
legacy state for backwards compatibility * Add global state manager * Implement Global/CDC state handling * Fix test issues * Fix issue with updated method signature * Handle empty state case in global state manager * Adjust to protocol changes * Fix failing acceptance tests * Fix failing test * Fix unmodifiable list issue * Fix unmodifiable exception * PR feedback * Abstract global state manager selection * Handle conversion between different state types * Handle invalid conversion * Rename parameter * Refactor state manager creation * Fix failing tests * Fix failing integration tests * Add CDC test * Fix failing integration test * Revert change * Fix failing integration test * Use per-stream for postgres tests * Formatting * Correct stream descriptor validation * Correct permalink * PR feedback * Bump Airbyte version from 0.39.22-alpha to 0.39.23-alpha (#13984) Co-authored-by: pmossman <[email protected]> * Adds test for new workflow (#13986) * Adds test for new workflow * Adds airbyte repo * remove testing line * Add new InterpolatedRequestOptionsProvider that encapsulates all variations of request arguments (#13472) * write out new request options provider and refactor components and parts of the YAML config * fix formatting * pr feedback to consolidate body_data_provider to simplify the code * pr feedback get rid of extraneous optional * publish oss for cloud (#13978) workflow to publish oss artifacts that cloud needs to build against use docker buildx to create arm images for local development * skip debezium engine startup in case no table is in INCREMENTAL mode (#13870) * 🎉 Source Github: break point added for workflows_runs stream (#13926) Signed-off-by: Sergey Chvalyuk <[email protected]> * 6339: error when attempting to use azure sql database within an elastic pool as source for cdc based replication (#13866) * 6339: debug info * 6339: not using 'USE' on Azure SQL servers * 6339: cleanup * 6339: cleanup2 * 6339: cleanup3 * 6339: versions/changelogs 
updated * 6339: merge from master (consolidation issue) * 6339: dev connector version (for testing in airbyte cloud) * 6339: code review implementation * 6339: apply formatting * in case runners fail to spin up, this needs to run on github-hosted (#13996) * 12708: Add an option to use encryption with staging in Redshift Destination (#13675) * 12708: Add an option to use encryption with staging in Redshift Destination * 12708: docs/docker configs updated * 12708: merge with master * 12708: merge fix * 12708: code review implementation * 12708: fix for older configs * 12708: fix for older configs in check * 12708: merge from master (consolidation issue) * 12708: versions updated * :tada: New Source: Webflow (#13617) * Added webflow code * Updated readme * Updated README * Added webflow to source_definitions.yaml * Enhanced documentation for the Webflow source connector * Improved webflow source connector instructions * Moved Site ID to before API token in Spec.yaml (for presentation in the UI) * Addressed comments in PR. * Changes to address requests in PR review * Removed version from config * Minor udpate to spec.yaml for clarity * Updated to pass the accept-version as a constant rather than parameter * Updated check_connection to hit the collections API that requires both site id and the authentication token. * Fixed the test_check_connection to use the new check_connection function * Added a streams test for generate_streams * Re-named "autentication" object to "auth" to be more consistent with the way it is created by the CDK * Added in an explict line to instantiante an "auth" object from WebflowTokenAuthenticator, to make it easier to describe in the blog * Fixed a typo in a comment * Renamed some classes to be more intuitive * Renamed class to be more intuitive * Minor change to an internal method name * Made _get_collection_name_to_id_dict staticmethod * Fixed a unit-test error that only appeared when running " python -m pytest -s unit_tests". 
This was caused by Mocked settings from test_source.py leaking into test_streams.py * format: add double quotes and remove unused import * readme: remove semantic version naming of connector in build commands * Updated spec.yaml * auto-bump connector version * format files * add changelog * update dockerfile * auto-bump connector version Co-authored-by: sajarin <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> Co-authored-by: marcosmarxm <[email protected]> * Source-oracle: fixed tests + checkstyle (#13997) * Source-oracle: fixed tests + checkstyle * 🐛Destination-mysql: fixed integration test and build process (#13302) * [13180] destination-mysql: fixed integration test * update changelog to include debezium version upgrade (#13844) * make table headers look less like successes (#13999) * source-twilio: implement lookback windows (#13896) * Revert "12708: Add an option to use encryption with staging in Redshift Destination (#13675)" (#14010) This reverts commit aa28d448d820df9d79c2c0d06b38978d1108fb2c. * Revert "6339: error when attempting to use azure sql database within an elastic pool as source for cdc based replication (#13866)" (#14011) This reverts commit 0d870bd37bc3b5cd798b92115d73bcc45a42d8f7. 
* [low-code connectors] BasicHttpAuthenticator (#13733) * implement basichttpauthenticator * add optional refresh access token authenticator * remove prints * type hints * Fix and unit test * missing test * Add class to __init__ file * Add comment * migrate JsonSchemas to use basic path instead of JSONPath (#13917) * scaffold for catalog diff, needs fixing on type handling and tests (#13786) * Prepare release of JDBC connectors (#13987) * Prepare release of JDBC connectors * Update source definitions manually * use built in check for if path is definite (#13834) * 13535 Fixed bastion network for integration tests (#14007) * doc: add error troubleshooting `docker-compose up` (#13765) * fix: duplicate resource allocations in `airbyte-temporal` deployment (#13816) * helm-chart: Fix worker deployment format error (#13839) * add catalog diff connection read (#13918) * doc: fix small typo on Shopify documentation (#13992) * add streams to reset to job info (#13919) * Generate api for changes in #13370 and make code compatible (#14014) * Generate api for per-stream updates #13835 (#14021) * Revert "Prepare release of JDBC connectors (#13987)" (#14029) This reverts commit df759b30778082508e2872513800fac34d98ff7c. 
* Fix per stream state protocol backward compatibility (#14032) * rename state type field to fix backwards compatibility issue * replace usages of stateType with type * support semi incremental by adding extractor record filter (#13520) * support semi incremental by adding extractor record filter * refactor extractor into a record_selector that supports extraction and filtering of response records * Remove pydantic spec from amazon ads and use YAML spec (#13988) * add EdDSA support in SSH tunnel (#9494) * add EdDSA support * verify EdDSA support works correct Co-authored-by: Yurii Bidiuk <[email protected]> * 🎉New source connector: source-metabase (#13752) * Add docs * Close metabase session when sync finishes * Close session in check_connection * Add source definition to seed * Add icon * improve cdc check for connectors (#14005) * improve should use cdc check * Revert "improve should use cdc check" This reverts commit 7d01727279d21d33a6c18ed3227ee94432636120. * improve should use cdc check * add unit test * Update webflow.md * Update webflow.md * Update webflow.md * Remove legacy sentry code from cdk (#14016) * rip sentry out of cdk * remove sentry dsn from gsc * Update webflow.md * Update webflow.md * Fixed broken links (#14071) * 🪟Persist unsaved changes on schema refresh (#13895) * add form values tracker context * add clarifying comment * add same functionality to create connection * Update airbyte-webapp/src/components/CreateConnectionContent/CreateConnectionContent.tsx Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> * Fixes broken links so we can deploy again (#14075) also adds better error message for when this happens to others * Adds symmary.md to gitignore (#14078) * Added webflow icon (#14069) * Added webflow icon * Added icon * Build create connection form build failure (#14081) * Fix CDK obfuscation of nested secrets (#14035) * Added Buy Credits section to Managing Airbyte Cloud (#13905) 
* Added Buy Credits section to Managing Airbyte Cloud * Made some style changes * Made edits based on Natalie's suggestions * Deleted link * Deleted line * Edited email address * Updated reaching out to sales sentence * disable es-lit to fix build (#14087) * Release source connectors (#14077) * Release source connectors * Fix issue with database connection in test * Fix failing tests due to authentication * auto-bump connector version * auto-bump connector version * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Bump Airbyte version from 0.39.23-alpha to 0.39.24-alpha (#14094) Co-authored-by: jdpgrailsdev <[email protected]> * Emit the state to remove in the airbyte empty source (#13725) What: This PR updates the EmptyAirbyteSource in order to perform a partial update and handle the new state message format. How: The empty source will now emit different messages based on the type of state being provided. Per stream: it will emit one message per stream that has been reset. Global: it will emit one global message that will contain null for the streams that have been reset, including the shared state. Co-authored-by: Jimmy Ma <[email protected]> * Add StatePersistence object (#13900) Add a StatePersistence object that supports reads/writes of states to the DB with StreamDescriptor fields. The only migrations that are supported are: * moving from LEGACY to GLOBAL * moving from LEGACY to STREAM * All other state type migrations are expected to go through an explicit reset beforehand.
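The state type migration rule from #13900 can be illustrated with a minimal Python sketch. It is not the actual StatePersistence code (which lives in the Java platform); the `StateType` enum, `SUPPORTED_MIGRATIONS`, and `can_migrate_without_reset` are hypothetical names used only to show the rule as stated in the commit message, and treating a same-type update as allowed is an assumption.

```python
from enum import Enum


class StateType(Enum):
    """State types named in the commit message (illustrative enum, not Airbyte's class)."""
    LEGACY = "LEGACY"
    GLOBAL = "GLOBAL"
    STREAM = "STREAM"


# The only migrations the commit message lists as supported without a reset.
SUPPORTED_MIGRATIONS = {
    (StateType.LEGACY, StateType.GLOBAL),
    (StateType.LEGACY, StateType.STREAM),
}


def can_migrate_without_reset(current: StateType, new: StateType) -> bool:
    """Hypothetical helper: True if the state type change needs no explicit reset."""
    if current == new:
        # Assumption: writing the same state type is a plain update, not a migration.
        return True
    return (current, new) in SUPPORTED_MIGRATIONS
```

Under these assumptions, `can_migrate_without_reset(StateType.LEGACY, StateType.STREAM)` is true, while a change from GLOBAL to STREAM would have to go through a reset first.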
* secret-persistence: Hashicorp Vault Secret Store (#13616) Co-authored-by: Amanda Murphy <[email protected]> Co-authored-by: Benoit Moriceau <[email protected]> * 🐛 Source Hubspot: remove `AirbyteSentry` dependency (#14102) * fixed * updated changelog * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * fix: format VaultSecretPersistenceTest.java (#14110) * Source Hubspot: extend error logging (#14054) * #291 incall - source Hubspot: extend error logging * huspot: upd changelog * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Update webflow.md (#14083) * Update webflow.md Removed a description that is only applicable to people that are writing connector code, not to _users_ of the connector. * Update webflow.md * Update webflow.md * Update webflow.md * Update webflow.md * 12708: Add an option to use encryption with staging in Redshift Desti… (#14013) * 12708: Add an option to use encryption with staging in Redshift Destination (#13675) * 12708: Add an option to use encryption with staging in Redshift Destination * 12708: docs/docker configs updated * 12708: merge with master * 12708: merge fix * 12708: code review implementation * 12708: fix for older configs * 12708: fix for older configs in check * 12708: merge from master (consolidation issue) * 12708: versions updated * 12708: specs updated * 12708: specs updated * 12708: removing autogenerated files from PR * 12708: changelog updated * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Source PayPal Transaction: Update Transaction Schema (#13682) * Update transaction schema. * Transform money values from strings to floats or integers. 
Co-authored-by: nataly <[email protected]> Co-authored-by: Augustin <[email protected]> * fix(jsonSchemas): raise error when items property not provided (#14018) * fix stream name in stream transformation update (#14044) * 🐛 Destination Redshift: Improved discovery for redshift-destination not SUPER streams (#13690) airbyte-12843: Improved discovery for redshift-destination not SUPER tables, excluded views from discovery. * Remove skiptests option (#14100) * update sentry release script (#14123) * Remove "additionalProperties": false from specs for connectors with staging (#14114) * Remove "additionalProperties": false from spec for connectors with staging * Remove "additionalProperties": false from spec for Redshift destination * bump versions * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * [14003] source-oracle: added custom jdbc field (#14092) * [14003] source-oracle: added custom jdbc field * Add JobErrorReporter for sending sync job connector failures to Sentry (#13899) * skeleton for reporting connector errors to sentry * report on job failures instead of attempt failures * report sync job failures with relevant metadata using JobErrorReporter * send stack traces from python connectors to sentry * test JobCreationAndStatusUpdate and JobErrorReporter * logs * refactor into helper, initial tests * using sentry * run format * load reporting client from env * load sentry dsn from env * send java stack traces to sentry * test sentryclient, refactor to use Hub instance * ErrorReportingClient.report -> .reportJobFailureReason * inject exception helper, test stack trace parse error tagging * rm logs * more stack trace tests * remove logs * fix failing tests * rename ErrorReportingClient to JobErrorReportingClient * rename vars in docker-compose * Return an Optional instead of 
null when parsing stack traces * dont remove airbyte prefix when setting release name * from_trace_message static * remove failureSummary from jobfailure input, get from Job * send stacktrace string if we weren't able to parse * set deployment mode tag * update .env * just log if something goes wrong * Use StateMessageHelper in source (#14125) * Use StateMessageHelper in source * PR feedback and formatting * More PR feedback * Revert change * Revert changes * Bump Airbyte version from 0.39.24-alpha to 0.39.25-alpha (#14124) Co-authored-by: brianjlai <[email protected]> * Refactor acceptance tests and utils (#13950) * Refactor Basic acceptance tests and utils * Refactor Advanced acceptance tests and utils * Remove unused code * Clear destination db data during cleanup * Cleanup comments * cleanup init code * test creating new desintation db for each test * cleanup desintation db init * Allow to edit api client * pull in temporal cloud changes * Rename helper to harness; set some funcs to private; turn init into constructor * add func to set env vars instead of using static vars and move some functionality out of init into acceptance tests * update javadoc Co-authored-by: Davin Chia <[email protected]> * fix javadoc formatting * fix var naming Co-authored-by: Davin Chia <[email protected]> * Bump Airbyte version from 0.39.25-alpha to 0.39.26-alpha (#14141) Co-authored-by: terencecho <[email protected]> * 🎉 octavia-cli: Add ability to get existing resources (#13254) * 13541 Fixed integration tests source-db2 Mac OS (#14133) * 13523 Fix integration tests destination-cassandra Mac OS (#14134) * 🐛 Source Hubspot: fixed SAT test, commented out expected_records (#14140) * :bug: Source Intercom: extend `Contacts` schema with new properties (#14099) * Source Twilio: adopt best practices (#14000) * #1946 Source twilio: aopt best practices - tune tests * #1946 add expected_records to acceptance-test-config.yml * #1946 source twilio - upd schema and changelog * #1946 fix 
expected_records * #1946 source twilio: rm alerts from expected records as they expire in 30 days * #1946 source twilio: bump version * 🎉 Source BingAds: expose hourly/daily/weekly/monthly options from configuration (#13801) * #12489 - expose hourly/daily/weekly/monthly reports in discovery by default instead of in the connector's configuration settings removed: config settings for hourly/daily/weekly/monthly reports added: default value for all periodic reports to True * #12489 - expose hourly/daily/weekly/monthly reports in discovery by default instead of in the connector's configuration settings removed: unused class variables, if-statement * #12489 - expose hourly/daily/weekly/monthly reports in discovery by default instead of in the connector's configuration settings removed: unused variables from config * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * remove VersionMismatchServer (#14076) * remove VersionMismatchServer * remove VersionMismatchServerTest * revert intended changes * Increase instance termination time limit to 3 hours to accommodate connector builds. (#14181) * Use correct bash comment symbol. (#14183) * 🎉 New Source: Orbit.love (#13390) * source-orbit: add definition and specs (#14189) * 🎉 Base Norrmalization: clean-up Redshift `tmp_schemas` after SAT (#14015) Now after `base-normalization` SAT the Destination Redshift will be automatically cleaned up from test leftovers. Other destinations are not covered yet. 
* Source Salesforce: fix customIntegrationTest for SAT (#14172) * Source Amazon Ads: increase timeout for SAT (#14167) * 🎉 Introduce Google Analytics Data API source (#12701) * Introduce Google Analytics Data API source https://developers.google.com/analytics/devguides/reporting/data/v1 * Add Google Analytics Data API source PR link * Add `client` class for Google Analytics Data API * Move dimensions and metrics extraction to the `client` class In the Google Analytics Data API * Change the copyright date to 2022 in Google Analytics Data API * fix: removing incremental syncs * fix: change project_id to string * fix: flake check is failing * chore: added it to source definitions * chore: update seed file Co-authored-by: Harshith Mullapudi <[email protected]> * 🐛 Destination Redshift: use s3 bucket path for s3 staging operations (#13916) * Publish acceptance test utils maven artifact (#14142) * Fix StatePersistence Legacy read/write (#14129) StatePersistence will wrap/unwrap legacy state on write/read to ensure compatibility with the old behavior/data. * 🎉 Destination connectors: Improved "SecondSync" checks in Standard Destination Acceptance tests (#14184) * [11731] Improved "SecondSync" checks in Standard Destination Acceptance tests * 🐛 Source Zendesk Support: fixed "Retry-After" non integer value (#14112) Signed-off-by: Sergey Chvalyuk <[email protected]> * Source Tiktok Marketing: Videometrics (#13650) * added video metrics in streams.py * common metrics list updated. * updated streams.py with extended metrics required. * updated stream_test * updated configured_catalog * video metrics required list updated. 
* chore: formatting * chore: bump version in source definitions * chore: update seed file Co-authored-by: Harshith Mullapudi <[email protected]> * 🎉 Source Github: secondary rate limits has to retry (#13955) Signed-off-by: Sergey Chvalyuk <[email protected]> * Harshith/test pr 13118 (#14192) * Firebolt destination * feat: Write method dropdown * feat: Use future-proof Auth in SDK * refactor: Move writer instantiation * fix: tests are failing * fix: tests are failing * fix: tests are failing * chore: added connector to definitions * fix: formatting and spec * fix: formatting for orbit Co-authored-by: ptiurin <[email protected]> * 🪟 :art: Show credit usage on chart's specific day (#13503) * add tooltip to chart * Fixes: - update main chart color; - change onHover background color * change chart color pallet to grey 500 * update color reference * remove opacity from UsageCell * 🐛 destination-redshift: use s3 bucket path for s3 cleanup (#14190) * Improve documentation for Postgres Source (#13830) * Improve documentation for Postgres Source * add information about additional JDBC params * add anchors for doc sections * fix link to CDC on Bare Metal * add more details about parsing date/time values * add doc link to SSH fields * Handle null reset source config (#14202) * handle null reset source config * format * Wait indefinitely if connection is not active (#14200) * also wait indefinitely if connection is deleted * fix test * Bump Airbyte version from 0.39.26-alpha to 0.39.27-alpha (#14204) Co-authored-by: lmossman <[email protected]> * Bmoric/feature flag for state deserialization (#14127) * Add Feature flag * Add default feature flag value * Update test * remove unsused * tmp * Update tests * rm unwanted change * PR comments * [low-code connectors] default types and default values (#14004) * default types and default values * cleanup * fixes so read works * remove prints and trycatch * comment * remove unused param * split file * extract method * extract methods * 
comment * optional * fix test * cleanup * delete interpolated request header provider * simplify next page url paginator interface * comment * format * add state type endpoint (#14111) * Bump Airbyte version from 0.39.27-alpha to 0.39.28-alpha (#14210) Co-authored-by: sherifnada <[email protected]> * 🐛 source-orbit: remove workspace_old.json (#14208) * Fix: Docs plural login redirecting to wrong URL (#14207) * [docs] fix numbering and incorrect filename in CDK docs (#13045) * [docs] fix numbering in CDK docs * Update 5-declare-schema.md * Update 5-declare-schema.md * Update 6-read-data.md * Update 8-test-your-connector.md * Remove the old scheduler from HelmCharts helper (#14187) * Remove the old scheduler from HelmCharts helper The old scheduler was removed as part of https://github.com/airbytehq/airbyte/pull/13400 * Remove legacy `scheduler` comment in HelmCharts * Source Gitlab: add GroupIssueBoards stream (#13252) * GitLab Source: add GroupIssueBoards stream * Address stream schema comments * Address comments * Bump version * Add as empty stream * run seed file source (#14215) * fix 'cannot reach server' error on demo instance (#10020) * Update CODEOWNERS (#14209) * 🎉 Source Github: use GraphQL for `reviews` stream (#13989) Signed-off-by: Sergey Chvalyuk <[email protected]> * workflow for publishing artifacts for cloud (#14199) * fix sentry org slug change (#14218) * Source File: correct spec json to match json format (#13738) * Upgrade spotless version and remove jvmargs workaround (#13705) * Source Zendesk Chat: Process large amount of data in batches for incremental (#14214) * increased the limit of itens in request * Configuration for max api pages on requests * included api_pagination_limit in sample * included api_pagination_limit in invalid_config * creating new table for chat_session * reverted api_pagination_limit approach * removed api_pagination_limit from TimeIncrementalStream * correct chat json * bump connector version * add changelog * run format 
* auto-bump connector version Co-authored-by: Roberto Bonnet <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> * Remove all @ts-ignore (#14221) * Bump hadoop to use version 3.3.3 (#14182) * Change the persistence activity to use the new persistence layer (#14205) * Change the persistence activity to use the new persistence layer * Use lombok * format * Use new State message helper * Fix build (#14225) * Fix build * Fix test * Use new state persistence for state reads (#14126) * Inject StatePersistence into DefaultJobCreator * Read the state from StatePersistence instead of ConfigRepository * Add a conversion helper to convert StateWrapper to State * Remove unused ConfigRepository.getConnectionState * Temporal per stream resets (#13990) * remove reset flags from workflow state + refactor * bring back cancelledForReset, since we need to distinguish between that case and a normal cancel * delete reset job streams on cancel or success * extract isResetJob to method * merge with master * set sync modes on streams in reset job correctly * format * Add test for getAllStreamsForConnection * fix tests * update more tests * add StreamResetActivityTests * fix tests for default job creator * remove outdated comment * remove debug lines * remove unused enum value * fix tests * fix constant equals ordering * make job mock not static * DRY and add comments * add comment about deleted streams * Remove io.airbyte.config.StreamDescriptor * regisster stream reset activity impl * refetch connection workflow when checking job id, since it may have been restarted * only cancel if workflow is running, to allow reset signal to always succeed even if batched with a workflow start * fix reset signal to use new doneWaiting workflow state prop * try to fix tests * fix reset cancel case * add acceptance test for resetting while sync is running * format * fix new acceptance test * lower sleep on test * raise sleep * increase sleep and timeout, and remove repeated 
test * use CatalogHelpers to extract stream descriptors * raise sleep and timeout to prevent transient failures * format Co-authored-by: alovew <[email protected]> * fix PostgresJdbcSourceAcceptanceTest by activating the feature flag (#14240) * fix PostgresJdbcSourceAcceptanceTest by activating the feature flag * fix AbstractJdbcSourceAcceptanceTest as well * fix expected_spec for strict encrypt * [13539] Fix integration tests source-clickhouse Mac OS (#14201) * [13539] Fix integration tests source-clickhouse Mac OS fixed unit tests * [13524] Fix integration tests destination-clickhouse Mac OS fixed unit tests * 6339: error when attempting to use azure sql database within an elastic pool as source for cdc based replication (#14121) * 6339: implementation * 6339: changelog updated * 6339: definitions updated * 6339: definitions reverted * 6339: still struggling with publishing * auto-bump connector version * 6339: definitions reverted - correct * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * 🪟 🎨 Update favicon and table row image styles (#14020) * style changes to favicon and imageblock * fix import * revert component and props names to block * Update airbyte-webapp/src/components/ImageBlock/ImageBlock.tsx Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> * Update airbyte-webapp/src/components/ImageBlock/ImageBlock.module.scss Co-authored-by: Vladimir <[email protected]> * Update airbyte-webapp/src/components/ImageBlock/ImageBlock.tsx Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> * Update airbyte-webapp/src/components/ImageBlock/ImageBlock.module.scss Co-authored-by: Vladimir <[email protected]> * add storybook Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> Co-authored-by: Vladimir <[email protected]> * upgrade potgresql version to fix default timestamp handling (#14211) * implement logic to trigger snapshot of new tables via debezium (#13994) * implement logic to trigger snapshot of new tables via 
debezium * format * improve test condition * fix build * BigQuery Denormalized "airbyte_type": "big_integer" to INT64 (#14079) * BigQuery Denormalized "airbyte_type": "big_integer" to INT64 * updated changelog * added unit test * removed star import * fixed checkstyle * bump version * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Add Metrics section to Scaling Airbyte doc (#14224) * Added metrics section to scaling airbyte doc * Updated URL in doc * Deleted link * Added link * Added backslashes before brackets that aren't links * Edited note about tagged metrics * Changed list * Changed spacing * Changed spacing * Changed spacing * Deleted period * Fixed broken firebolt link * Added tables * Cleaned up wording in tables * Add ability to provide source/destination connector docker image (#14266) * Add ability to provide source/destination connector docker image * Make constant public * Bump Airbyte version from 0.39.28-alpha to 0.39.29-alpha (#14232) * disable flaky cmw test temporarily (#14269) * release new postgres source connector version 0.4.29 (#14265) * release new postgres source connector version 0.4.29 * add changelog * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * :tada: Source Tiktok marketing - remove granularity config option (#13890) * Removed granularity config option from spec, added corresponsing streams for each support granularity (hourly daily, lifetime), updated unittests, SAT * auto-formating * auto-formating * removed AdvertisersIds stream from list of exposed streams, updated docs * expose new style streams since 0.1.13, expose old streams for config for older version * update spec * fixed path to catalog * increased timeout * source bing-ads to ga (#13679) * Source Tiktok marketing - increase connector version (#14272) * increased connector version * increased connector version in seed * auto-bump connector version Co-authored-by: Octavia Squidington III 
<[email protected]> * Fix flaky connection manager workflow test (#14271) * try thread sleep instead of test env, and run 100 times * replace testEnv.sleep with Thread.sleep in several tests * replace RepeatedTest with Test * replace testEnv.sleep with Thread.sleep after signals are executed * run each test 100 times to see if any are flaky * add log * change repetitions to 100 to avoid out of memory * format * swap repeated test for normal test * 13532 Fixed integration tests destination-mssql Mac OS (#14252) * 13532 Fixed integration tests destination-mssql Mac OS * Source Google Analytics: Specify integer for dimension ga:dateHourMinute (#14298) * Specify integer for dimension ga:dateHourMinute * Update changelog * 🎉 Source Github: rename field `mergeable` to `is_mergeable` (#14274) Signed-off-by: Sergey Chvalyuk <[email protected]> * Update Airbyte Client (#14270) * #12668 #13198 enable full refresh, disable incremental and expected_records (#14191) * 🎉 Destination S3: update INSTANCE_PROFILE to use AWSDefaultProfileCredential (#14231) Co-authored-by: Mike Balmer <[email protected]> * Source Zendesk Support: pagination group membership (#14304) * add next_page_tooken and request * correct group_membership paginatin * update doc * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * 🪟 🐛 Fix OAuth validation not allowing to create source or destination (#14197) * Enable "Set up source/destination" button only if the form is valid * Update how ServiceForm initial values are patched so that it correctly patches the configuration with default values * Update initial values patching in service form to use initialValues to preserve already set values Update useOAuthFlowAdapter to correctly merge the values from the oauth response * Remove unused values var from ServiceForm * Add acceptance tests for per-stream state updates (#14263) * Add acceptance tests for per-stream state updates * PR feedback * Formatting * More PR feedback * 
PR feedback * Remove unused constant * Make sure that the feature flag is transfer to container (#14314) * Make sure that the feature flag is transfer to container * propagate the feature flags * Avoid propagating the feature flags * Fix tests * Source Postgres : use more simple and comprehensive query to get selectable tables (#14251) * use more simple and comprehensive query to get selectable tables * cover case when schema is not specified * add test to check discover with different ways of grants * format * incr ver * incr ver * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * Fixed broken link * Fix for deleting stream resets (#14322) * Fix for deleting stream resets * Fix build by updating var (#14321) * Edited formatting (#14275) * Avoid error when creating dupl stream reset (#14328) * Bump Airbyte version from 0.39.29-alpha to 0.39.30-alpha (#14329) Co-authored-by: lmossman <[email protected]> * Release new postgres strict encrypt version (#14331) * Bump postgres strict encrypt version * Update changelogs * Update doc * Release new destination s3 version to pick up latest change (#14332) * Bump s3 version * Update pr id * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * 13538 Fix integration tests destination-scylla Mac OS (#14308) * 13538 Fix integration tests destination-scylla Mac OS * Update cdk-speedrun.md (#14258) Added a link at the bottom of the article , so the user may find the more in-depth tutorial about building a real-world connector. 
* Update README.md (#14303) Added a link to https://airbyte.com/tutorials/extract-data-from-the-webflow-api in Webflow's README.md * Update building-a-python-source.md (#14262) * Update webflow.md (#14254) Added a link to the new blog - https://airbyte.com/tutorials/extract-data-from-the-webflow-api Co-authored-by: Simon Späti <[email protected]> * Alex/declarative stream incremental fix (#14268) * checkout files from test branch * read_incremental works * reset to master * remove dead code * comment * fix * Add test * comments * utc * format * small fix * Add test with rfc3339 * remove unused param * fix test * 🐛 SingerSource: Fix incompatibilities and typing issues (#14148) * Use logging.Logger in SingerSource * Fix SingerSource ConfigContainer This fixes typing issues with `ConfigContainer` and makes it compatible with `split_config`. Fixes #8710. * Fix SingerSource state and catalog typer issues * Rename SingerSource method args to match parent classes * Remove old comment about excluding Singer Co-authored-by: Alexandre Girard <[email protected]> * Update source postgres release stage to beta (#14326) * fix NPE (#14353) * fix NPE * Add test * Fix trailing * 🎉 octavia-cli: Add ability to import existing resources (#14137) * helm chart: Add Image Pull Secrets Param (#14031) * fix format (#14354) * Bump Airbyte version from 0.39.30-alpha to 0.39.31-alpha (#14355) Co-authored-by: benmoriceau <[email protected]> * tiktok to ga (#14358) * Update state.state type (#14360) * Run some DATs as part of base-normalization tests (#14312) * Revert "🎉 Source Github: rename field `mergeable` to `is_mergeable` (#14274)" (#14338) * Revert "🎉 Source Github: rename field `mergeable` to `is_mergeable` (#14274)" * Properly update the hasEmitted state (#14367) * Bmoric/state aggregator (#14364) * Update state.state type * Add state aggregator * Test and format * PR comments * Move to its own package * Update 
airbyte-workers/src/test/java/io/airbyte/workers/internal/state_aggregator/StateAggregatorTest.java Co-authored-by: Lake Mossman <[email protected]> * format * Update airbyte-workers/src/main/java/io/airbyte/workers/internal/state_aggregator/DefaultStateAggregator.java Co-authored-by: Lake Mossman <[email protected]> * format Co-authored-by: Lake Mossman <[email protected]> * Bump Airbyte version from 0.39.31-alpha to 0.39.32-alpha (#14383) Co-authored-by: alafanechere <[email protected]> * 🐛 Source Mixpanel: fix SAT tests (#14349) * Call the new revoke_user_session endpoint from the FE (#13165) * Source Instagram: change releaseStage to GA (#14162) * Source Google Analytics: Change releaseStage to GA (#13957) * source-outreach: fix record parsing and cursor field access (#14386) * Kustomize: Use `resources` since `bases` is deprecated (#14037) * fix: clone api doesn't take update configurations (#13592) * fix: clone api doesn't take update configurations * fix: you will be able to create clone in different workspace * fix: added description to source/destination body * cdk: Attach namespace to stream in catalog (#13923) * Source TiDB: correct jdbc string builder (#14243) * add icon for tidb-connector * Fix TiDB source connector * bump connector version * auto-bump connector version Co-authored-by: marcosmarxm <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> * Source Google Ads: use docsaurus feature for warn/note and udpdate doc (#14392) * use docsaurus feature for warn/note and udpdate doc * update description in supported streams * Source Facebook Marketing: allow configuration of MAX_BATCH_SIZE (#14267) * Add max batch size config * Bump semver * add changelog * auto-bump connector version Co-authored-by: Octavia Squidington III <[email protected]> * 🎉 Source Github: add Retry for GraphQL API Resource limitations (#14376) Signed-off-by: Sergey Chvalyuk <[email protected]> * Add more metadata to the JobErrorReporter (#14395) * add 
workspace_id and connector_repository as tags * add tag for connection url * fix urls for job notifier * format * fix failing test * beta -> generally_available (#14315) Signed-off-by: Sergey Chvalyuk <[email protected]> * helm chart: Fix/double printing of extra volume mounts (#14091) * SentryJobErrorReporter: better handling of multiline chained java exceptions (#14398) * Docs: deploy on gcp use docusaurus tabs (#14401) * Revert "Kustomize: Use `resources` since `bases` is deprecated (#14037)" (#14415) This reverts commit 5c9a6a5fc655a9e597f755be8fc8ccf805a2537a. * Use Debezium Postgres image for CDC tests (#14318) * Use Debezium Postgres image for CDC tests * Formatting * 🎉 octavia-cli: Add ability to import all resources (#14374) * Bump Airbyte version from 0.39.32-alpha to 0.39.33-alpha (#14419) Co-authored-by: pedroslopez <[email protected]> * 📝 MySql source: clarify tinyint to number conversion when size > 1 (#14424) * 🪟 🐛 Fix Setup Source Button on OAuth Sources (#14413) * don't disable setup button * make eslint happy * one more cleanup * use the spec to decide how to create config object * Bump Airbyte version from 0.39.33-alpha to 0.39.34-alpha (#14428) Co-authored-by: timroes <[email protected]> * [low-code cdk] Enable configurable state checkpointing (#14317) * checkout files from test branch * read_incremental works * reset to master * remove dead code * comment * fix * Add test * comments * utc * format * small fix * Add test with rfc3339 * remove unused param * fix test * configurable state checkpointing * update test * fix type hints (#14352) * normalization: Do not return NULL for MySQL column values > 512 chars (#11694) Co-authored-by: Augustin <[email protected]> Co-authored-by: Edmundo Ruiz Ghanem <[email protected]> Co-authored-by: Evan Tahler <[email protected]> Co-authored-by: Tim Roes <[email protected]> Co-authored-by: Charles <[email protected]> Co-authored-by: Jonathan Pearlin <[email protected]> Co-authored-by: Amruta Ranade <[email 
protected]> Co-authored-by: Benoit Moriceau <[email protected]> Co-authored-by: Jimmy Ma <[email protected]> Co-authored-by: Ganpat Agarwal <[email protected]> Co-authored-by: Serhii Chvaliuk <[email protected]> Co-authored-by: Rajakavitha Kodhandapani <[email protected]> Co-authored-by: Yevhen Sukhomud <[email protected]> Co-authored-by: Andrii Leonets <[email protected]> Co-authored-by: George Claireaux <[email protected]> Co-authored-by: VitaliiMaltsev <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> Co-authored-by: sw-yx <[email protected]> Co-authored-by: Baz <[email protected]> Co-authored-by: Octavia Squidington III <[email protected]> Co-authored-by: alafanechere <[email protected]> Co-authored-by: Eugene <[email protected]> Co-authored-by: Denis Davydov <[email protected]> Co-authored-by: Anna Lvova <[email protected]> Co-authored-by: Vladimir <[email protected]> Co-authored-by: Phlair <[email protected]> Co-authored-by: Parker Mossman <[email protected]> Co-authored-by: Adam <[email protected]> Co-authored-by: Liren Tu <[email protected]> Co-authored-by: pmossman <[email protected]> Co-authored-by: Topher Lubaway <[email protected]> Co-authored-by: Brian Lai <[email protected]> Co-authored-by: Peter Hu <[email protected]> Co-authored-by: Subodh Kant Chaturvedi <[email protected]> Co-authored-by: Tuhai Maksym <[email protected]> Co-authored-by: Alexander Marquardt <[email protected]> Co-authored-by: sajarin <[email protected]> Co-authored-by: marcosmarxm <[email protected]> Co-authored-by: Alexandre Girard <[email protected]> Co-authored-by: steve withington <[email protected]> Co-authored-by: Leo Sussan <[email protected]> Co-authored-by: cenegd <[email protected]> Co-authored-by: Tomas Perez Alvarez <[email protected]> Co-authored-by: Lake Mossman <[email protected]> Co-authored-by: Sherif A. 
Nada <[email protected]> Co-authored-by: Edward Gao <[email protected]> Co-authored-by: Yurii Bidiuk <[email protected]> Co-authored-by: Christophe Duong <[email protected]> Co-authored-by: Teal Larson <[email protected]> Co-authored-by: Sophia Wiley <[email protected]> Co-authored-by: jdpgrailsdev <[email protected]> Co-authored-by: Jimmy Ma <[email protected]> Co-authored-by: Stella Chung <[email protected]> Co-authored-by: Amanda Murphy <[email protected]> Co-authored-by: Mohamed Magdy <[email protected]> Co-authored-by: nataly <[email protected]> Co-authored-by: Tyler Russell <[email protected]> Co-authored-by: Alexander Tsukanov <[email protected]> Co-authored-by: Pedro S. Lopez <[email protected]> Co-authored-by: brianjlai <[email protected]> Co-authored-by: terencecho <[email protected]> Co-authored-by: Davin Chia <[email protected]> Co-authored-by: terencecho <[email protected]> Co-authored-by: Daniel Diamond <[email protected]> Co-authored-by: drrest <[email protected]> Co-authored-by: Marcos Marx <[email protected]> Co-authored-by: Abhi Vaidyanatha <[email protected]> Co-authored-by: Harshith Mullapudi <[email protected]> Co-authored-by: Zawar Khan <[email protected]> Co-authored-by: ptiurin <[email protected]> Co-authored-by: Greg Solovyev <[email protected]> Co-authored-by: lmossman <[email protected]> Co-authored-by: sherifnada <[email protected]> Co-authored-by: Sachin Jangid <[email protected]> Co-authored-by: Chris Wu <[email protected]> Co-authored-by: Jared Rhizor <[email protected]> Co-authored-by: tison <[email protected]> Co-authored-by: Roberto Bonnet <[email protected]> Co-authored-by: Malik Diarra <[email protected]> Co-authored-by: alovew <[email protected]> Co-authored-by: Oleksandr Sheheda <[email protected]> Co-authored-by: midavadim <[email protected]> Co-authored-by: Arsen Losenko <[email protected]> Co-authored-by: Ryan Lewon <[email protected]> Co-authored-by: Mike Balmer <[email protected]> Co-authored-by: Anne <[email protected]> 
Co-authored-by: Liren Tu <[email protected]> Co-authored-by: Simon Späti <[email protected]> Co-authored-by: Albin Skott <[email protected]> Co-authored-by: Caleb Fornari <[email protected]> Co-authored-by: benmoriceau <[email protected]> Co-authored-by: Christian Martin <[email protected]> Co-authored-by: jordan-glitch <[email protected]> Co-authored-by: Daemonxiao <[email protected]> Co-authored-by: Keith Thompson <[email protected]> Co-authored-by: Leo Sussan <[email protected]> Co-authored-by: pedroslopez <[email protected]> Co-authored-by: timroes <[email protected]> Co-authored-by: Johannes Nicolai <[email protected]>
1 parent 3f001b1 commit 0232182

109 files changed: +3005 -327 lines changed

airbyte-db/db-lib/src/main/java/io/airbyte/db/factory/DataSourceFactory.java

Lines changed: 5 additions & 0 deletions
@@ -267,6 +267,11 @@ public DataSource build() {
        * will preserve existing behavior that tests for the connection on first use, not on creation.
        */
       config.setInitializationFailTimeout(Integer.MIN_VALUE);
+      /*
+       * Default timeout is 30 sec, which is too short when you work with cloud data warehouses clusters
+       * that can take 4-5 min to start up. Set it to 30 min to be sure
+       */
+      config.setConnectionTimeout(30 * 60 * 1000);

       connectionProperties.forEach(config::addDataSourceProperty);
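
The timeout literal above is a millisecond count. As a minimal sketch (not part of this commit; the class and method names are hypothetical), the same value can be written with `java.time.Duration`, which makes the unit explicit and allows a comparison with the 30-second default mentioned in the diff comment:

```java
import java.time.Duration;

// Hypothetical illustration only; not code from this commit.
class ConnectionTimeoutSketch {

  // The value passed to config.setConnectionTimeout(...) in the diff.
  static long literalTimeoutMillis() {
    return 30 * 60 * 1000;
  }

  // The same 30 minutes expressed with an explicit unit.
  static long durationTimeoutMillis() {
    return Duration.ofMinutes(30).toMillis();
  }

  public static void main(String[] args) {
    // 30 s is the default cited in the diff comment.
    long defaultMillis = Duration.ofSeconds(30).toMillis();
    System.out.println("literal  = " + literalTimeoutMillis() + " ms");
    System.out.println("duration = " + durationTimeoutMillis() + " ms");
    System.out.println("ratio to default = " + durationTimeoutMillis() / defaultMillis);
  }
}
```

Both methods yield the same number; the `Duration` form is only a readability choice and is an assumption of this sketch, not something the commit does.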

airbyte-integrations/bases/base-normalization/.dockerignore

Lines changed: 1 addition & 0 deletions
@@ -10,4 +10,5 @@
 !dbt-project-template-oracle
 !dbt-project-template-clickhouse
 !dbt-project-template-snowflake
+!dbt-project-template-databricks
 !dbt-project-template-redshift

airbyte-integrations/bases/base-normalization/build.gradle

Lines changed: 6 additions & 0 deletions
@@ -75,6 +75,10 @@ task airbyteDockerSnowflake(type: Exec, dependsOn: checkSshScriptCopy) {
     configure buildAirbyteDocker('snowflake')
     dependsOn assemble
 }
+task airbyteDockerDatabricks(type: Exec, dependsOn: checkSshScriptCopy) {
+    configure buildAirbyteDocker('databricks')
+    dependsOn assemble
+}
 task airbyteDockerRedshift(type: Exec, dependsOn: checkSshScriptCopy) {
     configure buildAirbyteDocker('redshift')
     dependsOn assemble
@@ -85,6 +89,7 @@ airbyteDocker.dependsOn(airbyteDockerMySql)
 airbyteDocker.dependsOn(airbyteDockerOracle)
 airbyteDocker.dependsOn(airbyteDockerClickhouse)
 airbyteDocker.dependsOn(airbyteDockerSnowflake)
+airbyteDocker.dependsOn(airbyteDockerDatabricks)
 airbyteDocker.dependsOn(airbyteDockerRedshift)

 task("customIntegrationTestPython", type: PythonTask, dependsOn: installTestReqs) {
@@ -100,6 +105,7 @@ task("customIntegrationTestPython", type: PythonTask, dependsOn: installTestReqs
     dependsOn ':airbyte-integrations:connectors:destination-oracle:airbyteDocker'
     dependsOn ':airbyte-integrations:connectors:destination-mssql:airbyteDocker'
     dependsOn ':airbyte-integrations:connectors:destination-clickhouse:airbyteDocker'
+    dependsOn ':airbyte-integrations:connectors:destination-databricks:airbyteDocker'
 }

 // DATs have some additional tests that exercise normalization code paths,
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+FROM fishtownanalytics/dbt:1.0.0
+COPY --from=airbyte/base-airbyte-protocol-python:0.1.1 /airbyte /airbyte
+
+# Install SSH Tunneling dependencies
+RUN apt-get update && apt-get install -y jq sshpass
+
+WORKDIR /airbyte
+COPY entrypoint.sh .
+COPY build/sshtunneling.sh .
+
+WORKDIR /airbyte/normalization_code
+COPY normalization ./normalization
+COPY setup.py .
+COPY dbt-project-template/ ./dbt-template/
+COPY dbt-project-template-databricks/* ./dbt-template/
+
+# Install python dependencies
+WORKDIR /airbyte/base_python_structs
+RUN pip install .
+
+WORKDIR /airbyte/normalization_code
+RUN pip install .
+
+WORKDIR /airbyte/normalization_code/dbt-template/
+# Download external dbt dependencies
+RUN pip install dbt-databricks==1.0.0
+RUN dbt deps
+
+WORKDIR /airbyte
+ENV AIRBYTE_ENTRYPOINT "/airbyte/entrypoint.sh"
+ENTRYPOINT ["/airbyte/entrypoint.sh"]
+
+LABEL io.airbyte.version=0.1.73
+LABEL io.airbyte.name=airbyte/normalization-databricks
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# This file is necessary to install dbt-utils with dbt deps
+# the content will be overwritten by the transform function
+
+# Name your package! Package names should contain only lowercase characters
+# and underscores. A good package name should reflect your organization's
+# name or the intended use of these models
+name: "airbyte_utils"
+version: "1.0"
+config-version: 2
+
+# This setting configures which "profile" dbt uses for this project. Profiles contain
+# database connection information, and should be configured in the ~/.dbt/profiles.yml file
+profile: "normalize"
+
+# These configurations specify where dbt should look for different types of files.
+# The `model-paths` config, for example, states that source models can be found
+# in the "models/" directory. You probably won't need to change these!
+model-paths: ["models"]
+docs-paths: ["docs"]
+analysis-paths: ["analysis"]
+test-paths: ["tests"]
+seed-paths: ["data"]
+macro-paths: ["macros"]
+
+target-path: "../build" # directory which will store compiled SQL files
+log-path: "../logs" # directory which will store DBT logs
+packages-install-path: "/tmp/dbt_modules" # directory which will store external DBT dependencies
+
+clean-targets: # directories to be removed by `dbt clean`
+  - "build"
+  - "dbt_modules"
+
+quoting:
+  database: true
+  # Temporarily disabling the behavior of the ExtendedNameTransformer on table/schema names, see (issue #1785)
+  # all schemas should be unquoted
+  schema: false
+  identifier: false
+
+# You can define configurations for models in the `model-paths` directory here.
+# Using these configurations, you can enable or disable models, change how they
+# are materialized, and more!
+models:
+  +transient: false
+  airbyte_utils:
+    +materialized: table
+    generated:
+      airbyte_ctes:
+        +tags: airbyte_internal_cte
+        +materialized: ephemeral
+      airbyte_incremental:
+        +tags: incremental_tables
+        +materialized: incremental
+        +incremental_strategy: merge
+        # schema change test is supported automatically by the merge operation
+        # need to be run against a cluster with spark.databricks.delta.schema.autoMerge.enabled = True
+        # schema merge being handled at the final step, if a schema changes in one of the primary keys
+        # that coalesce differently to string, unicity will be broken
+        +on_schema_change: "ignore"
+        +file_format: delta
+        +pre-hook: 'SET spark.databricks.delta.schema.autoMerge.enabled = True'
+      airbyte_tables:
+        +tags: normalized_tables
+        +materialized: table
+        +file_format: delta
+      airbyte_views:
+        +tags: airbyte_internal_views
+        +materialized: view
+
+dispatch:
+  - macro_namespace: dbt_utils
+    search_order: ["airbyte_utils", "dbt_utils"]

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/cross_db_utils/array.sql

Lines changed: 9 additions & 0 deletions
@@ -6,6 +6,7 @@
     - postgres: unnest() -> https://www.postgresqltutorial.com/postgresql-array/
     - MSSQL: openjson() –> https://docs.microsoft.com/en-us/sql/relational-databases/json/validate-query-and-change-json-data-with-built-in-functions-sql-server?view=sql-server-ver15
     - ClickHouse: ARRAY JOIN> https://clickhouse.com/docs/zh/sql-reference/statements/select/array-join/
+    - Databricks: LATERAL VIEW -> https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-qry-select-lateral-view.html
 #}

 {# cross_join_unnest ------------------------------------------------- #}
@@ -50,6 +51,10 @@
     cross join table(flatten({{ array_col }})) as {{ array_col }}
 {%- endmacro %}

+{% macro databricks__cross_join_unnest(stream_name, array_col) -%}
+    lateral view outer explode(from_json({{ array_col }}, 'array<string>')) as _airbyte_nested_data
+{%- endmacro %}
+
 {% macro sqlserver__cross_join_unnest(stream_name, array_col) -%}
 {# https://docs.microsoft.com/en-us/sql/relational-databases/json/convert-json-data-to-rows-and-columns-with-openjson-sql-server?view=sql-server-ver15#option-1---openjson-with-the-default-output #}
     CROSS APPLY (
@@ -87,6 +92,10 @@
     _airbyte_nested_data
 {%- endmacro %}

+{% macro databricks__unnested_column_value(column_col) -%}
+    _airbyte_nested_data
+{%- endmacro %}
+
 {% macro oracle__unnested_column_value(column_col) -%}
     {{ column_col }}
 {%- endmacro %}
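
The `databricks__cross_join_unnest` macro above relies on Spark SQL's `LATERAL VIEW OUTER explode(...)`. As a rough sketch of the row-level semantics only (plain Java with hypothetical names, not code from this commit), each array element becomes one output value, and `OUTER` keeps a single null value when the array is null or empty, so the parent row is not dropped:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of "lateral view outer explode" semantics; not code from this commit.
class OuterExplodeSketch {

  // Returns one nested value per output row produced for a single parent row.
  static List<String> outerExplode(List<String> array) {
    List<String> nestedValues = new ArrayList<>();
    if (array == null || array.isEmpty()) {
      // OUTER: emit one row whose nested value is null instead of dropping the parent row.
      nestedValues.add(null);
      return nestedValues;
    }
    // Non-empty array: one output row per element, in order.
    nestedValues.addAll(array);
    return nestedValues;
  }

  public static void main(String[] args) {
    System.out.println(outerExplode(Arrays.asList("x", "y")));
    System.out.println(outerExplode(Collections.emptyList()));
    System.out.println(outerExplode(null));
  }
}
```

Without `OUTER`, a plain `explode` would produce no rows for the empty and null cases; this sketch models only the `OUTER` variant that the macro uses.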

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/cross_db_utils/columns.sql

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,3 +14,31 @@
 {% endcall %}

 {% endmacro %}
+
+{#
+This changes the behaviour of the default adapter macro, since DBT defaults to 256 when there are no explicit varchar limits
+(cf : https://github.com/dbt-labs/dbt-core/blob/3996a69861d5ba9a460092c93b7e08d8e2a63f88/core/dbt/adapters/base/column.py#L91)
+Since normalization code uses varchar for string type (and not text) on postgres, we need to set the max length possible when using unlimited varchars
+(cf : https://dba.stackexchange.com/questions/189876/size-limit-of-character-varying-postgresql)
+#}
+
+{% macro postgres__get_columns_in_relation(relation) -%}
+{% call statement('get_columns_in_relation', fetch_result=True) %}
+select
+column_name,
+data_type,
+COALESCE(character_maximum_length, 10485760),
+numeric_precision,
+numeric_scale
+
+from {{ relation.information_schema('columns') }}
+where table_name = '{{ relation.identifier }}'
+{% if relation.schema %}
+and table_schema = '{{ relation.schema }}'
+{% endif %}
+order by ordinal_position
+
+{% endcall %}
+{% set table = load_result('get_columns_in_relation').table %}
+{{ return(sql_convert_columns_in_relation(table)) }}
+{% endmacro %}
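
The comment above `postgres__get_columns_in_relation` explains that a column with no explicit varchar limit should report the maximum length rather than fall back to dbt's default of 256. A minimal Python sketch of that fallback follows; the names `resolve_char_size` and `POSTGRES_MAX_VARCHAR` are hypothetical and not part of the commit, and only the value 10485760 is taken from the macro.

```python
from typing import Optional

# Value used by the macro's COALESCE fallback for unlimited varchar columns.
POSTGRES_MAX_VARCHAR = 10485760


def resolve_char_size(character_maximum_length: Optional[int]) -> int:
    """Mirror COALESCE(character_maximum_length, 10485760) from the macro."""
    if character_maximum_length is None:
        # information_schema reports NULL when the varchar has no declared limit.
        return POSTGRES_MAX_VARCHAR
    return character_maximum_length


if __name__ == "__main__":
    print(resolve_char_size(None))  # unlimited varchar
    print(resolve_char_size(64))    # varchar(64)
```

A declared limit is passed through unchanged, so only unlimited columns are affected by the override.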

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/cross_db_utils/current_timestamp.sql

Lines changed: 4 additions & 0 deletions
@@ -5,3 +5,7 @@
 {% macro oracle__current_timestamp() %}
 CURRENT_TIMESTAMP
 {% endmacro %}
+
+{% macro databricks__current_timestamp() %}
+CURRENT_TIMESTAMP
+{% endmacro %}

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/cross_db_utils/datatypes.sql

Lines changed: 22 additions & 0 deletions
@@ -8,6 +8,10 @@
 string
 {% endmacro %}

+{%- macro databricks__type_json() -%}
+string
+{%- endmacro -%}
+
 {%- macro redshift__type_json() -%}
 {%- if redshift_super_type() -%}
 super
@@ -91,6 +95,10 @@
 INT
 {% endmacro %}

+{% macro databricks__type_int() %}
+INT
+{% endmacro %}
+

 {# bigint ------------------------------------------------- #}
 {% macro mysql__type_bigint() %}
@@ -105,6 +113,10 @@
 BIGINT
 {% endmacro %}

+{% macro databricks__type_bigint() %}
+BIGINT
+{% endmacro %}
+

 {# numeric ------------------------------------------------- --#}
 {% macro mysql__type_numeric() %}
@@ -115,6 +127,10 @@
 Float64
 {% endmacro %}

+{% macro databricks__type_numeric() %}
+FLOAT
+{% endmacro %}
+

 {# timestamp ------------------------------------------------- --#}
 {% macro mysql__type_timestamp() %}
@@ -146,6 +162,12 @@
 timestamp
 {% endmacro %}

+{#-- Spark timestamps are already 'point in time', even if converted / stored without the original tz info, relative to session tz --#}
+{#-- cf: https://docs.databricks.com/spark/latest/dataframes-datasets/dates-timestamps.html --#}
+{% macro databricks__type_timestamp_with_timezone() %}
+timestamp
+{% endmacro %}
+
 {#-- MySQL doesnt allow cast operation to work with TIMESTAMP so we have to use char --#}
 {%- macro mysql__type_timestamp_with_timezone() -%}
 char

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/cross_db_utils/json_operations.sql

Lines changed: 26 additions & 0 deletions
@@ -6,6 +6,7 @@
 - Postgres: json_extract_path_text(<from_json>, 'path' [, 'path' [, ...}}) -> https://www.postgresql.org/docs/12/functions-json.html
 - MySQL: JSON_EXTRACT(json_doc, 'path' [, 'path'] ...) -> https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html
 - ClickHouse: JSONExtractString(json_doc, 'path' [, 'path'] ...) -> https://clickhouse.com/docs/en/sql-reference/functions/json-functions/
+- Databricks: get_json_object(json_txt, 'path') -> https://spark.apache.org/docs/latest/api/sql/#get_json_object
 #}

 {# format_json_path -------------------------------------------------- #}
@@ -42,6 +43,15 @@
 {{ "'$.\"" ~ json_path_list|join(".") ~ "\"'" }}
 {%- endmacro %}

+{% macro databricks__format_json_path(json_path_list) -%}
+{# -- '$.x.y.z' #}
+{%- set str_list = [] -%}
+{%- for json_path in json_path_list -%}
+{%- if str_list.append(json_path.replace("'", "\\'")) -%} {%- endif -%}
+{%- endfor -%}
+{{ "'$." ~ str_list|join(".") ~ "'" }}
+{%- endmacro %}
+
 {% macro redshift__format_json_path(json_path_list) -%}
 {%- set quote = '"' if redshift_super_type() else "'" -%}
 {%- set str_list = [] -%}
@@ -86,6 +96,14 @@
 json_extract({{ from_table}}.{{ json_column }}, {{ format_json_path(json_path_list) }})
 {%- endmacro %}

+{% macro databricks__json_extract(from_table, json_column, json_path_list, normalized_json_path) -%}
+{%- if from_table|string() == '' %}
+get_json_object({{ json_column }}, {{ format_json_path(json_path_list) }})
+{% else %}
+get_json_object({{ from_table }}.{{ json_column }}, {{ format_json_path(json_path_list) }})
+{% endif -%}
+{%- endmacro %}
+
 {% macro oracle__json_extract(from_table, json_column, json_path_list, normalized_json_path) -%}
 json_value({{ json_column }}, {{ format_json_path(normalized_json_path) }})
 {%- endmacro %}
@@ -191,6 +209,10 @@
 JSONExtractRaw(assumeNotNull({{ json_column }}), {{ format_json_path(json_path_list) }})
 {%- endmacro %}

+{% macro databricks__json_extract_scalar(json_column, json_path_list, normalized_json_path) -%}
+get_json_object({{ json_column }}, {{ format_json_path(json_path_list) }})
+{%- endmacro %}
+
 {# json_extract_array ------------------------------------------------- #}

 {% macro json_extract_array(json_column, json_path_list, normalized_json_path) -%}
@@ -237,6 +259,10 @@
 JSONExtractArrayRaw(assumeNotNull({{ json_column }}), {{ format_json_path(json_path_list) }})
 {%- endmacro %}

+{% macro databricks__json_extract_array(json_column, json_path_list, normalized_json_path) -%}
+get_json_object({{ json_column }}, {{ format_json_path(json_path_list) }})
+{%- endmacro %}
+
 {# json_extract_string_array ------------------------------------------------- #}

 {% macro json_extract_string_array(json_column, json_path_list, normalized_json_path) -%}
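
The Databricks macros in this file build a `'$.x.y.z'` path and pass it to `get_json_object`, qualifying the column with the table name only when one is supplied. The Python sketch below mirrors that string building for illustration. The function names echo the macros, but this is an assumed re-implementation rather than code from the commit, and it produces SQL text only (nothing is executed against Spark).

```python
from typing import List


def format_json_path(json_path_list: List[str]) -> str:
    """Build a quoted '$.x.y.z' path, as databricks__format_json_path appears to do."""
    # Escape single quotes so the path can sit inside a SQL string literal.
    escaped = [part.replace("'", "\\'") for part in json_path_list]
    return "'$." + ".".join(escaped) + "'"


def json_extract(from_table: str, json_column: str, json_path_list: List[str]) -> str:
    """Render the get_json_object call, with or without a table qualifier."""
    # An empty table name means the column is referenced unqualified.
    column = json_column if from_table == "" else f"{from_table}.{json_column}"
    return f"get_json_object({column}, {format_json_path(json_path_list)})"


if __name__ == "__main__":
    print(json_extract("", "_airbyte_data", ["user", "id"]))
    print(json_extract("raw", "_airbyte_data", ["user", "id"]))
```

Note that the scalar and array variants in the diff render the same `get_json_object` call, so the extracted value is a string in each case.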

airbyte-integrations/bases/base-normalization/dbt-project-template/macros/should_full_refresh.sql

Lines changed: 33 additions & 2 deletions
@@ -4,12 +4,43 @@
 - the column _airbyte_ab_id does not exists in the normalized tables and make sure it is well populated.
 #}

+{%- macro get_columns_in_relation_if_exist(target_table) -%}
+{{ return(adapter.dispatch('get_columns_in_relation_if_exist')(target_table)) }}
+{%- endmacro -%}
+
+{%- macro default__get_columns_in_relation_if_exist(target_table) -%}
+{{ return(adapter.get_columns_in_relation(target_table)) }}
+{%- endmacro -%}
+
+{%- macro databricks__get_columns_in_relation_if_exist(target_table) -%}
+{%- if target_table.schema is none -%}
+{%- set found_table = True %}
+{%- else -%}
+{% call statement('list_table_infos', fetch_result=True) -%}
+show tables in {{ target_table.schema }} like '*'
+{% endcall %}
+{%- set existing_tables = load_result('list_table_infos').table -%}
+{%- set found_table = [] %}
+{%- for table in existing_tables -%}
+{%- if table.tableName == target_table.identifier -%}
+{% do found_table.append(table.tableName) %}
+{%- endif -%}
+{%- endfor -%}
+{%- endif -%}
+{%- if found_table -%}
+{%- set cols = adapter.get_columns_in_relation(target_table) -%}
+{{ return(cols) }}
+{%- else -%}
+{{ return ([]) }}
+{%- endif -%}
+{%- endmacro -%}
+
 {%- macro need_full_refresh(col_ab_id, target_table=this) -%}
 {%- if not execute -%}
 {{ return(false) }}
 {%- endif -%}
 {%- set found_column = [] %}
-{%- set cols = adapter.get_columns_in_relation(target_table) -%}
+{%- set cols = get_columns_in_relation_if_exist(target_table) -%}
 {%- for col in cols -%}
 {%- if col.column == col_ab_id -%}
 {% do found_column.append(col.column) %}
@@ -18,7 +49,7 @@
 {%- if found_column -%}
 {{ return(false) }}
 {%- else -%}
-{{ dbt_utils.log_info(target_table ~ "." ~ col_ab_id ~ " does not exist yet. The table will be created or rebuilt with dbt.full_refresh") }}
+{{ dbt_utils.log_info(target_table ~ "." ~ col_ab_id ~ " does not exist. The table needs to be rebuilt in full_refresh") }}
 {{ return(true) }}
 {%- endif -%}
 {%- endmacro -%}
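
The Databricks override above checks that the target table is listed in its schema before asking for its columns, and `need_full_refresh` then reports whether the id column is missing. A rough Python sketch of that control flow is given below. The helper names and the `get_columns` callback are hypothetical stand-ins for the adapter calls, and the macro's special case for a missing schema (where the table is assumed to exist) is left out for brevity.

```python
from typing import Callable, Iterable, List


def columns_if_exist(
    existing_tables: Iterable[str],
    target_table: str,
    get_columns: Callable[[str], List[str]],
) -> List[str]:
    """Return the table's columns, or an empty list when the table is not listed."""
    # Only ask for columns when the table shows up in the schema listing.
    if target_table not in set(existing_tables):
        return []
    return get_columns(target_table)


def need_full_refresh(columns: Iterable[str], col_ab_id: str) -> bool:
    """A rebuild is needed when the id column is absent from the table."""
    return col_ab_id not in set(columns)


if __name__ == "__main__":
    # Hypothetical catalog: one existing table and its columns.
    catalog = {"users": ["_airbyte_ab_id", "name"]}
    for table in ("users", "orders"):
        cols = columns_if_exist(catalog.keys(), table, lambda t: catalog[t])
        print(table, need_full_refresh(cols, "_airbyte_ab_id"))
```

Returning an empty column list for a missing table makes the second check report that a rebuild is needed, which matches the message logged by the macro.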
