🧱 private preview with Unity catalog and Oauth using service principal #37613
Conversation
applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
mainClass = 'io.airbyte.integrations.destination.databricks.DatabricksDestinationKt'
applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0',
        '-XX:NativeMemoryTracking=detail', '-XX:+UnlockDiagnosticVMOptions',
add some comment as to why this is here
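For reference, the kind of inline comment being asked for might look like the sketch below. The rationale given for the diagnostic flags is an assumption on my part (they look like temporary native-memory debugging switches), and the argument list is truncated in the diff above, so this is illustrative only:

```groovy
application {
    mainClass = 'io.airbyte.integrations.destination.databricks.DatabricksDestinationKt'
    // ExitOnOutOfMemoryError: fail fast on OOM so the platform restarts the
    // container instead of it limping along in a broken state.
    // MaxRAMPercentage=75.0: cap the heap to leave headroom for off-heap
    // memory inside the container.
    // NativeMemoryTracking + UnlockDiagnosticVMOptions: (assumed) temporary
    // diagnostics for investigating native memory usage; remove once done.
    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0',
            '-XX:NativeMemoryTracking=detail', '-XX:+UnlockDiagnosticVMOptions']
}
```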
import io.airbyte.integrations.destination.databricks.model.OAuth2Authentication
import javax.sql.DataSource

object ConnectorClientsFactory {
what do we think about always prefixing our connector-specific classes with the connector name (DatabricksConnectorClientFactory here)?
connectorConfig.authentication
)
val storageOperations =
    DatabricksStorageOperations(
dumb question: is there any case where we'll use the DatabricksStorageOperations with anything other than those particular types? Should we just create them inside that constructor?
They're injected as dependencies for mocking, although I didn't write unit tests in this PR 😅. Will do that in the CDK PR when doing the bigquery work.
I think we'll want to experiment with how we test these components. E.g. do we ever want to test sqlgenerator/destinationhandler/etc in isolation, or are we always just testing the entire StorageOperations object?
(my personal guess is that we'll mostly just run integration tests against FooStorageOperations - e.g. our BaseSqlGeneratorIntegrationTest is also expanding to exercise destinationhandler stuff)
also, up for discovery: is it actually just StagingDestinationStorageOperations, and we want to inject databricks stuff through the connector? Or is there actual databricks-specific stuff directly in the StorageOperations impl? (I don't know the answer, this is something we should figure out)
Yeah, we can eventually fold our SqlGen tests into StorageOps tests. TBH all it needs injected is static ConnectorConfig info like schema, catalog, and jdbcUrl; both SqlGen and DestHandler can be instantiated internally.
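The injection-for-mockability tradeoff being discussed can be sketched like this. These are hypothetical, heavily simplified stand-ins, not the actual connector classes: the point is only that a constructor-injected collaborator can be replaced by a fake in a unit test, whereas one built inside the constructor cannot.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a collaborator (e.g. a SqlGenerator) that would normally
// talk to the warehouse.
interface SqlGenerator {
    String createTable(String name);
}

// Stand-in for something like DatabricksStorageOperations: it takes its
// collaborator through the constructor, so a test can pass a fake instead
// of a real JDBC-backed implementation.
class StorageOperations {
    private final SqlGenerator sqlGenerator;
    private final List<String> executedSql = new ArrayList<>();

    StorageOperations(SqlGenerator sqlGenerator) {
        this.sqlGenerator = sqlGenerator;
    }

    void prepareStream(String streamName) {
        // In the real connector this would execute SQL; here we just record it.
        executedSql.add(sqlGenerator.createTable(streamName));
    }

    List<String> executedSql() {
        return executedSql;
    }
}

public class InjectionSketch {
    public static void main(String[] args) {
        // In a unit test the "mock" is just a lambda; no database needed.
        StorageOperations ops = new StorageOperations(name -> "CREATE TABLE " + name);
        ops.prepareStream("users");
        System.out.println(ops.executedSql());
    }
}
```

If the collaborators only need static config (schema, catalog, jdbcUrl), constructing them internally and injecting just the config, as suggested above, removes this seam but simplifies the wiring.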
I only skimmed this, fully expecting us to make a lot of changes while we're working on the bigquery stuff. Just a couple nits/info comments.
@@ -23,4 +23,5 @@ data:
  sl: 100
  ql: 100
  supportLevel: community
  supportsDbt: true
I don't think we want this? My understanding (80%) is that this triggers the platform's custom dbt transform thing, which they recently killed.
copy-pasta. I'll remove it.
🪓
import io.airbyte.commons.jackson.MoreMappers

data class DatabricksConnectorConfig(
    @JsonProperty("accept_terms") val termsAccepted: Boolean,
nit: our spec.json already has an explicit title field to control what the UI displays, so why not just use camelCase for the object keys?
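The rename that @JsonProperty performs here is purely mechanical: a snake_case wire key mapped onto a camelCase field. A minimal stdlib-only illustration of that mapping (this is the naming convention, not Jackson's internals):

```java
public class SnakeToCamel {
    // Converts a snake_case JSON key into the camelCase property name that
    // Jackson would otherwise derive directly from the field name.
    static String toCamelCase(String snake) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : snake.toCharArray()) {
            if (c == '_') {
                upperNext = true;       // next letter starts a new word
            } else {
                out.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("accept_terms")); // acceptTerms
    }
}
```

Using camelCase keys in spec.json, as suggested, would make the annotation (and this mapping) unnecessary.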
@@ -15,12 +15,11 @@
plugins {
    id 'application'
    id 'airbyte-java-connector'
    id "de.undercouch.download" version "5.0.1"
lol, what did this do?
😂 no clue
@@ -344,7 +344,8 @@ Delta Lake tables are created. You may want to consult the tutorial on
## CHANGELOG
There will be a lot more docs to update once we are ready to release - fine to skip that for now
Review guide
Ignore the deletes in src/main/java, src/test/java, and src/test-integration/java: that is old code being removed. The Databricks destination now uses the AsyncConsumer, V2 column names, and stripped-down versions of the classes needed for staging and TypeDedupe.
User Impact
This is a breaking change and a private preview release. The destination only supports the CSV file format at the moment, and works with the Unity catalog for tables and Volumes. Cloud and OSS are pinned to 1.1.0, which was the last known stable release of the community connector.
🥳
Can this PR be safely reverted and rolled back?