
S3 Destination: Decreased thread allocation & memory ratio for AsyncConsumer #43714

Merged: 2 commits into master from issue-43713/s3-destination-oom, Aug 13, 2024

Conversation

@johnny-schmidt (Contributor) commented Aug 11, 2024

This issue: some configurations OOM consistently in the cloud. I can't find evidence this is affecting customers, but bumping memory to 2 GB (the same as other async destinations) seems to let them succeed.

There are three changes here:

  • increase the memory requirement to 2 GB
  • reduce the memory ratio to 0.5
  • make it possible to tweak the memory ratio and concurrency entirely with a connector release (no CDK release required)

I built a dev connector off this, and it allows even the most demanding loads to make steady progress.
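The third bullet above can be sketched roughly as follows. This is a hypothetical illustration, not the actual CDK code: the class and parameter names here (`AsyncConsumerConfig`, `maxBufferBytes`) are invented for the example, though `memoryRatio` and `nThreads` match the parameters shown in the review diff below.

```kotlin
// Hypothetical sketch: exposing the tuning knobs as constructor parameters
// with defaults means a connector can override them in its own release,
// with no CDK change required.
open class AsyncConsumerConfig(
    val environment: Map<String, String> = System.getenv(),
    val memoryRatio: Double = 0.5, // fraction of the heap given to async buffering
    val nThreads: Int = 5          // upload worker threads
) {
    // Bytes the async framework may buffer before applying backpressure.
    fun maxBufferBytes(): Long =
        (Runtime.getRuntime().maxMemory() * memoryRatio).toLong()
}

fun main() {
    // A connector that OOMs can ship tighter defaults without touching the CDK:
    val tuned = AsyncConsumerConfig(memoryRatio = 0.5, nThreads = 1)
    println(tuned.maxBufferBytes() > 0) // prints "true"
}
```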

@johnny-schmidt johnny-schmidt requested a review from a team as a code owner August 11, 2024 17:39

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/destination/s3 labels Aug 11, 2024
@johnny-schmidt johnny-schmidt requested a review from a team as a code owner August 11, 2024 18:03
@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Aug 11, 2024
@johnny-schmidt johnny-schmidt force-pushed the issue-43713/s3-destination-oom branch from 57bce43 to 9e638f7 Compare August 11, 2024 18:04
@johnny-schmidt johnny-schmidt requested a review from gisripa August 11, 2024 19:12
@johnny-schmidt johnny-schmidt force-pushed the issue-43713/s3-destination-oom branch from 9e638f7 to 1310452 Compare August 11, 2024 19:15
@johnny-schmidt (Contributor, Author) commented:

Successful sync with this config: 5e7102b0b1e7e19b14650c8b0c6b976f95687079!

@evantahler (Contributor) left a comment:


Ideally, connectors should work in lower-memory environments, just more slowly (e.g. smaller batches or less parallelism). This is good both for our cloud costs and for self-hosted users who don't have large K8s clusters.

Since most syncs seem to work with the 1 GB setting, I'd suggest that we delay rolling this out for all cloud syncs unless there really is no other option. E.g., maybe the parallelism for the S3 destination needs to be lower, or the % of memory we allocate to an async worker should be increased?

[edit] Somehow I was commenting on a phantom version of this PR that didn't also include the thread / memory % changes. Are those alone enough, or do we also need the 2 GB bump for syncs to succeed?

Review thread on this diff:

- protected val environment: Map<String, String> = System.getenv()
+ protected val environment: Map<String, String> = System.getenv(),
+ private val memoryRatio: Double = 0.5,
+ private val nThreads: Int = 5
Contributor: Should this be some function of the number of CPUs that are allocated to the container?

@johnny-schmidt (Contributor, Author) replied:

Probably, but it's hardcoded to five currently. Also, according to the resource runbook, syncs only have 1 core by default. I assumed the point was to parallelize workloads we expected to be network-bound (though since we're staging everything to a local file before syncing, that's probably less true).

@johnny-schmidt (Contributor, Author) commented:

@evantahler I reverted the memory increase request. Now this change:

  • reduces the memory ratio from 0.7 to 0.5
  • reduces the number of worker threads from 5 to 1

I will confirm this works for all configurations against the benchmark without significant performance reduction.
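Back-of-the-envelope, the ratio change alone frees a meaningful amount of headroom. The arithmetic below is illustrative (it assumes a 1 GiB heap; actual heap sizes depend on deployment), not a measurement from the PR:

```kotlin
fun main() {
    val heap = 1L shl 30  // assume a 1 GiB heap
    val mib = 1L shl 20
    // Memory the async framework may claim for buffering at each ratio, in MiB:
    println((heap * 0.7).toLong() / mib) // prints 716 (the old 0.7 ratio)
    println((heap * 0.5).toLong() / mib) // prints 512 (the new 0.5 ratio)
    // The ~200 MiB difference stays available to everything else on the heap.
}
```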

@gisripa (Contributor) left a comment:


No blocking comments. I'm not sure whether defaulting to Runtime.availableProcessors (or something similar) is sane, or whether even that would be bad for Parquet.

@johnny-schmidt (Contributor, Author) replied:

> No blocking comments. I'm not sure whether defaulting to Runtime.availableProcessors (or something similar) is sane, or whether even that would be bad for Parquet.

@gisripa judging by the orchestrator logs, that's going to be 1 anyway
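For reference, the suggested alternative of deriving the worker count from the container's CPUs would indeed collapse to 1 on the default 1-core sync pods. A sketch of the idea (not what was merged):

```kotlin
fun main() {
    // Worker count derived from the CPUs the JVM can see, clamped to at least 1.
    // On a pod limited to a single core this evaluates to 1, matching the
    // value this PR hardcodes.
    val defaultThreads = Runtime.getRuntime().availableProcessors().coerceAtLeast(1)
    println(defaultThreads >= 1) // prints "true"
}
```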

@johnny-schmidt johnny-schmidt force-pushed the issue-43713/s3-destination-oom branch from ca03764 to 4d99c05 Compare August 12, 2024 22:01
@johnny-schmidt johnny-schmidt changed the title S3 Destination: Increased Memory for AsyncConsumer → S3 Destination: Decreased thread allocation & memory ratio for AsyncConsumer Aug 12, 2024
@johnny-schmidt johnny-schmidt force-pushed the issue-43713/s3-destination-oom branch from 4d99c05 to d51d428 Compare August 12, 2024 22:04
@johnny-schmidt johnny-schmidt merged commit 4c4a105 into master Aug 13, 2024
37 of 40 checks passed
@johnny-schmidt johnny-schmidt deleted the issue-43713/s3-destination-oom branch August 13, 2024 15:24
LouisAuneau pushed a commit to LouisAuneau/airbyte that referenced this pull request Aug 13, 2024
Labels:
  • area/connectors (Connector related issues)
  • area/documentation (Improvements or additions to documentation)
  • CDK (Connector Development Kit)
  • connectors/destination/s3
4 participants