Destination Snowflake Execute COPY in parallel #10212
/test connector=connectors/destination-snowflake
two final comments, but otherwise this lgtm!
import java.util.concurrent.ExecutorService;
import java.util.stream.IntStream;

interface SnowflakeParallelCopyStreamCopier {
can you add javadoc comments to the methods in this interface?
done
...ke/src/main/java/io/airbyte/integrations/destination/snowflake/SnowflakeGcsStreamCopier.java
/test connector=connectors/destination-snowflake
/publish connector=connectors/destination-snowflake
/publish connector=connectors/destination-snowflake
What
Currently, destination-snowflake's copy modes will generate multiple files on S3 or GCS, and then for each of those files, execute a COPY command in serial. We should run those commands in parallel to be more time-efficient.
How
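A single COPY command can accept up to 1,000 files, so the generated files can be grouped into batches of at most 1,000 and the COPY commands for those batches run in parallel.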
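The batching idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `ParallelCopySketch`, its helpers `partition`, `filesClause`, and `copyInParallel`, the thread count, and the file names are all hypothetical. It assumes file names are split into batches of at most 1,000 and that one copy action per batch is submitted to an `ExecutorService`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch; names and structure are assumptions, not the PR's code.
public class ParallelCopySketch {

  // Upper bound on files per COPY statement, as stated in the PR description.
  static final int MAX_FILES_PER_COPY = 1000;

  // Split the file names into consecutive batches of at most batchSize entries.
  static List<List<String>> partition(List<String> files, int batchSize) {
    int batchCount = (files.size() + batchSize - 1) / batchSize;
    return IntStream.range(0, batchCount)
        .mapToObj(i -> files.subList(i * batchSize, Math.min(files.size(), (i + 1) * batchSize)))
        .collect(Collectors.toList());
  }

  // Illustrative rendering of a file list for one COPY statement (no escaping).
  static String filesClause(List<String> batch) {
    return batch.stream()
        .map(f -> "'" + f + "'")
        .collect(Collectors.joining(", ", "files = (", ")"));
  }

  // Run one copy action per batch on a fixed thread pool and wait for all of them.
  static void copyInParallel(List<String> files, int threads, Consumer<List<String>> copyBatch) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      List<CompletableFuture<Void>> futures = partition(files, MAX_FILES_PER_COPY).stream()
          .map(batch -> CompletableFuture.runAsync(() -> copyBatch.accept(batch), executor))
          .collect(Collectors.toList());
      // join() rethrows, wrapped in CompletionException, if any batch failed.
      CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> files = IntStream.range(0, 2500)
        .mapToObj(i -> "part_" + i + ".csv")
        .collect(Collectors.toList());
    // A real copier would execute a COPY statement here; this only reports batch sizes.
    copyInParallel(files, 4, batch -> System.out.println("COPY with " + batch.size() + " files"));
  }
}
```

With 2,500 files this yields three batches (1,000, 1,000, and 500), so three COPY statements would be issued instead of 2,500. The order in which batches complete is not deterministic.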
Recommended reading order
SnowflakeParallelCopyStreamCopier.java
SnowflakeS3StreamCopier.java
SnowflakeGcsStreamCopier.java
🚨 User Impact 🚨
There should be no visible user impact.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md. See description and examples
docs/SUMMARY.md
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
docs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing
/publish command described here
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md. See description and examples
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing
/publish command described here
Connector Generator
-scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes