
Add new index and cluster level settings to limit the total primary shards per node and per index #17295


Merged: 16 commits into opensearch-project:main on Feb 25, 2025

Conversation

@pandeydivyansh1803 (Contributor) commented Feb 7, 2025

Description

For a remote-store-backed cluster, segment replication is used as the replication strategy. With segment replication, segments are created only on the primary shard and are then copied to the replica shards. Since segment creation is CPU-intensive, we have observed CPU skew between nodes of the same cluster when primary shards are not balanced.

The earlier attempts to rebalance primary shards across nodes (#6422, #12250) certainly help reduce the skew, but they work on a best-effort basis and do not enforce any constraint.

This PR implements two new settings in OpenSearch:
index.routing.allocation.total_primary_shards_per_node: an index-level setting to limit the primary shards per node for a specific index. The limit (indexTotalPrimaryShardsPerNodeLimit) is stored in index metadata, similar to indexTotalShardsPerNodeLimit.
cluster.routing.allocation.total_primary_shards_per_node: a cluster-level setting to limit the total number of primary shards on a node.

These settings will enhance control over primary shard distribution, improving cluster balance and performance management.
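As an illustration only (not code from this PR), the sketch below shows how the new limits could be supplied through OpenSearch's Settings builder. The setting keys are the ones described above; the numeric values and the surrounding example class are hypothetical.

```java
import org.opensearch.common.settings.Settings;

// Hypothetical example values; only the setting keys come from this PR's description.
public class PrimaryShardLimitSettingsExample {
    public static void main(String[] args) {
        Settings indexSettings = Settings.builder()
            // per-index cap: at most 2 primaries of this index on any single node
            .put("index.routing.allocation.total_primary_shards_per_node", 2)
            .build();
        Settings clusterSettings = Settings.builder()
            // cluster-wide cap: at most 10 primaries (across all indices) on any single node
            .put("cluster.routing.allocation.total_primary_shards_per_node", 10)
            .build();
        System.out.println(indexSettings + " " + clusterSettings);
    }
}
```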
The existing ShardsLimitAllocationDecider class already contains the necessary infrastructure and logic to evaluate shard allocation constraints. It has access to the current cluster state, routing information, and methods to check shard counts per node. Given this existing functionality, we propose implementing the two new primary shard limit settings within this class. This approach leverages the current decision-making framework, ensuring consistency with existing allocation rules and minimizing code duplication. By extending the ShardsLimitAllocationDecider, we can efficiently integrate the new primary shard limit checks into the existing allocation decision process.
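To make the decision logic concrete, here is a small self-contained sketch of the kind of per-node primary-count check involved. The Shard record and the exceedsLimits helper are invented for illustration; the actual change works with OpenSearch's ShardRouting/RoutingNode types inside ShardsLimitAllocationDecider. A value of -1 is treated as "no limit", mirroring the convention of the existing total_shards_per_node setting.

```java
import java.util.List;

// Standalone illustration only: a hypothetical Shard record stands in for ShardRouting,
// and the limit check mirrors the idea of the decider, not its actual implementation.
public class PrimaryShardLimitCheckSketch {

    record Shard(String index, boolean primary) {}

    /** Returns true if allocating one more primary of `index` to this node would exceed either limit (-1 disables a limit). */
    static boolean exceedsLimits(List<Shard> shardsOnNode, String index,
                                 int clusterPrimaryLimit, int indexPrimaryLimit) {
        long totalPrimaries = shardsOnNode.stream().filter(Shard::primary).count();
        long indexPrimaries = shardsOnNode.stream()
            .filter(s -> s.primary() && s.index().equals(index))
            .count();
        boolean clusterExceeded = clusterPrimaryLimit >= 0 && totalPrimaries + 1 > clusterPrimaryLimit;
        boolean indexExceeded = indexPrimaryLimit >= 0 && indexPrimaries + 1 > indexPrimaryLimit;
        return clusterExceeded || indexExceeded;
    }

    public static void main(String[] args) {
        List<Shard> node = List.of(new Shard("logs", true), new Shard("logs", false), new Shard("metrics", true));
        // Index "logs" already has 1 primary on this node and the per-index limit is 1, so this prints true.
        System.out.println(exceedsLimits(node, "logs", 3, 1));
    }
}
```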

Related Issues

Resolves #17293

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


github-actions bot commented Feb 7, 2025

❌ Gradle check result for 721865e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


github-actions bot commented Feb 7, 2025

❌ Gradle check result for 920f71a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


github-actions bot commented Feb 9, 2025

✅ Gradle check result for ebb6a2b: SUCCESS


linuxpi commented Feb 21, 2025

Please resolve conflicts in CHANGELOG-3.0.md and ensure gradle check is passing

Signed-off-by: Divyansh Pandey <[email protected]>

✅ Gradle check result for 1eb6f20: SUCCESS

Divyansh Pandey added 2 commits February 24, 2025 14:36

❕ Gradle check result for 28def93: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


linuxpi commented Feb 24, 2025

@pandeydivyansh1803 Changes LGTM. Please add some tests to cover the cases where the user is not able to set the new index/cluster settings for a cluster that is not remote-store enabled.

…et for cluster which is not remote store enabled

Signed-off-by: Divyansh Pandey <[email protected]>

❌ Gradle check result for 36d29c8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


✅ Gradle check result for 36d29c8: SUCCESS

@linuxpi linuxpi merged commit bc209ee into opensearch-project:main Feb 25, 2025
30 checks passed
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in Storage Project Board Feb 25, 2025
@linuxpi linuxpi added the backport 2.x Backport to 2.x branch label Feb 26, 2025
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-17295-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bc209ee6bacbb1027dcd7ba28d56b6ceb96f4fe0
# Push it to GitHub
git push --set-upstream origin backport/backport-17295-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-17295-to-2.x.


linuxpi commented Feb 26, 2025

@pandeydivyansh1803 can you manually raise the backport PR?

pandeydivyansh1803 pushed a commit to pandeydivyansh1803/OpenSearchFork that referenced this pull request Feb 27, 2025
vinaykpud pushed a commit to vinaykpud/OpenSearch that referenced this pull request Mar 18, 2025
…hards per node and per index (opensearch-project#17295)

* Added a new index level setting to limit the total primary shards per index per node. Added relevant files for unit test and integration test.

Signed-off-by: Divyansh Pandey <[email protected]>

* update files for code quality

Signed-off-by: Divyansh Pandey <[email protected]>

* moved primary shard count function to RoutingNode.java

Signed-off-by: Divyansh Pandey <[email protected]>

* removed unwanted files

Signed-off-by: Divyansh Pandey <[email protected]>

* added cluster level setting to limit total primary shards per node

Signed-off-by: Divyansh Pandey <[email protected]>

* allow the index level settings to be applied to both DOCUMENT and SEGMENT replication indices

Signed-off-by: Divyansh Pandey <[email protected]>

* Added necessary validator to restrict the index and cluster level primary shards per node settings only for remote store enabled cluster. Added relevant unit and integration tests.

Signed-off-by: Divyansh Pandey <[email protected]>

* refactoring changes

Signed-off-by: Divyansh Pandey <[email protected]>

* refactoring changes

Signed-off-by: Divyansh Pandey <[email protected]>

* Empty commit to rerun gradle test

Signed-off-by: Divyansh Pandey <[email protected]>

* optimised the calculation of total primary shards on a node

Signed-off-by: Divyansh Pandey <[email protected]>

* Refactoring changes

Signed-off-by: Divyansh Pandey <[email protected]>

* refactoring changes, added TODO to MetadataCreateIndexService

Signed-off-by: Divyansh Pandey <[email protected]>

* Added integration test for scenario where primary shards setting is set for cluster which is not remote store enabled

Signed-off-by: Divyansh Pandey <[email protected]>

---------

Signed-off-by: Divyansh Pandey <[email protected]>
Signed-off-by: Divyansh Pandey <[email protected]>
Co-authored-by: Divyansh Pandey <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Labels
backport 2.x, backport-failed, enhancement
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

[Feature Request] Primary Shard Count Constraint
6 participants