[Feature][Manager][Sort] Resource adaptive adjustment for Hudi #7072

Closed
@featzhang

Description

Hudi Flink jobs are often given unreasonable resource allocations. Over-allocation wastes resources, while under-allocation leads to back pressure or OOM errors.

When allocating resources, first determine the parallelism (concurrency) of the source side, so that no data backlog builds up upstream while reading. As a general reference configuration: for a table partitioned by day with about 15 billion records per day, a source parallelism of about 50 is used. For other data volumes, scale this figure accordingly.
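The conversion for other data volumes could be sketched as below. This is only an illustration: it assumes that source parallelism scales linearly with daily record count from the reference point above (15 billion records per day, parallelism 50), which the issue does not state explicitly. The function name `estimate_source_parallelism` and both constants are hypothetical names introduced here.

```python
# Reference point taken from the issue text: ~15 billion records/day, ~50 source subtasks.
BASELINE_RECORDS_PER_DAY = 15_000_000_000
BASELINE_SOURCE_PARALLELISM = 50


def estimate_source_parallelism(records_per_day: int) -> int:
    """Scale the reference parallelism linearly with daily volume (an assumption).

    The result is rounded up and never drops below 1.
    """
    if records_per_day <= 0:
        raise ValueError("records_per_day must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = records_per_day * BASELINE_SOURCE_PARALLELISM
    scaled = (numerator + BASELINE_RECORDS_PER_DAY - 1) // BASELINE_RECORDS_PER_DAY
    return max(1, scaled)


if __name__ == "__main__":
    for volume in (1_000_000_000, 3_000_000_000, 15_000_000_000):
        print(volume, estimate_source_parallelism(volume))
```

Real jobs may not scale linearly (record size, partition skew, and the upstream system all matter), so the result is best treated as a starting value to be checked against observed back pressure.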

After determining the source-side parallelism, set the write parallelism at a source-to-write ratio of 1:1.5 or 1:2.
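A minimal sketch of that ratio rule follows. It reads the ratio as source:write, so the write side gets 1.5 to 2 times the source parallelism; that reading is an assumption. The helper name `write_parallelism_range` is hypothetical, and rounding up to a whole number of subtasks is a choice made here, not something the issue specifies.

```python
import math
from typing import Tuple


def write_parallelism_range(
    source_parallelism: int,
    low_ratio: float = 1.5,
    high_ratio: float = 2.0,
) -> Tuple[int, int]:
    """Return the (lower, upper) write parallelism for a given source parallelism.

    Assumes the 1:1.5 and 1:2 ratios mean source:write.
    """
    if source_parallelism < 1:
        raise ValueError("source_parallelism must be at least 1")
    if not 0 < low_ratio <= high_ratio:
        raise ValueError("ratios must satisfy 0 < low_ratio <= high_ratio")
    # Round up so the write side is never sized below the requested ratio.
    lower = math.ceil(source_parallelism * low_ratio)
    upper = math.ceil(source_parallelism * high_ratio)
    return lower, upper


if __name__ == "__main__":
    low, high = write_parallelism_range(50)
    print(f"source=50 -> write parallelism between {low} and {high}")
```

For the reference source parallelism of 50, this gives a write parallelism between 75 and 100.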

Use case

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

Metadata

Assignees

No one assigned

    Labels

    stage/stale (Issues or PRs that had no activity for a long time), type/feature

