
🎉 Snowflake destination: reduce memory footprint #10394


Merged: 20 commits merged on Feb 17, 2022

Conversation


@tuliren tuliren commented Feb 17, 2022

@github-actions github-actions bot added the area/connectors (Connector related issues), area/platform (issues related to the platform), and area/worker (Related to worker) labels Feb 17, 2022
@tuliren tuliren changed the title from "Investigate snowflake destination memory usage" to "🎉 Snowflake destination: reduce memory footprint" Feb 17, 2022
@tuliren tuliren requested a review from subodh1810 February 17, 2022 09:04
@github-actions github-actions bot added the area/documentation (Improvements or additions to documentation) label Feb 17, 2022

@subodh1810 subodh1810 left a comment


I'm not sure I understand how this reduces heap consumption. The only difference I see is that the previous logic calculated the size of every record using `s.getBytes(StandardCharsets.UTF_8).length`, whereas the new logic uses `Jsons.serialize(data).length() * 4L`, and only for every 20th record. How does this lower heap consumption?


tuliren commented Feb 17, 2022

@subodh1810, calculating the byte length of the serialized JSON strings creates a lot of byte array objects, so the fix is to stop generating those byte arrays. Before this change, the connector ran out of memory with a max heap size of 500 MB. After this change, it works even with a 300 MB heap.

See here for raw data.
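The trade-off discussed in this thread can be illustrated with a minimal sketch. It is not the connector's actual code: the class `RecordSizeEstimator`, its method names, and the `Supplier<String>` standing in for `Jsons.serialize(data)` are all hypothetical. Only the two size expressions and the "every 20th record" interval come from the review comment above. The exact computation allocates a fresh `byte[]` for every record, while the estimate multiplies the string length by 4, which is a safe upper bound because UTF-8 never needs more than 4 bytes per UTF-16 code unit.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class RecordSizeEstimator {
    // Assumed sampling interval, taken from the "every 20th record" remark in the review.
    private static final int SAMPLE_EVERY = 20;

    private long lastEstimate = 0L;
    private long seen = 0L;

    // Exact UTF-8 size. Allocates a new byte[] on every call.
    static long exactSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Upper bound on the UTF-8 size. No byte[] is allocated.
    static long upperBound(String s) {
        return s.length() * 4L;
    }

    // Re-serializes only on sampled records and reuses the last estimate otherwise.
    // The supplier is a hypothetical stand-in for the real serialization call.
    long estimate(Supplier<String> serializer) {
        if (seen % SAMPLE_EVERY == 0) {
            lastEstimate = upperBound(serializer.get());
        }
        seen++;
        return lastEstimate;
    }

    public static void main(String[] args) {
        String json = "{\"id\":1,\"name\":\"example\"}";
        System.out.println("exact bytes: " + exactSize(json));
        System.out.println("upper bound: " + upperBound(json));
    }
}
```

The estimate overcounts, in this sketch by up to 4x for ASCII-only payloads, so a buffer sized with it would flush earlier than strictly necessary. That costs some batching efficiency in exchange for fewer short-lived allocations.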


tuliren commented Feb 17, 2022

/publish connector=connectors/destination-snowflake

🕑 connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/1860490882
❌ connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/1860490882


tuliren commented Feb 17, 2022

/publish connector=connectors/destination-snowflake

🕑 connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/1860626168
✅ connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/1860626168

@github-actions github-actions bot added the CDK (Connector Development Kit) label Feb 17, 2022
@tuliren tuliren merged commit 049a11b into master Feb 17, 2022
@tuliren tuliren deleted the liren/snowflake-memory-investigation branch February 17, 2022 20:55
Labels
area/connectors (Connector related issues), area/documentation (Improvements or additions to documentation), area/platform (issues related to the platform), area/worker (Related to worker), CDK (Connector Development Kit)