Add new index and cluster level settings to limit the total primary shards per node and per index #17295
❌ Gradle check result for 721865e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 920f71a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Please resolve conflicts in
Signed-off-by: Divyansh Pandey <[email protected]>
server/src/main/java/org/opensearch/cluster/metadata/MetadataCreateIndexService.java
...ain/java/org/opensearch/cluster/routing/allocation/decider/ShardsLimitAllocationDecider.java
Signed-off-by: Divyansh Pandey <[email protected]>
…chFork Merge main to sync changelog updates with local changes.
❕ Gradle check result for 28def93: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
@pandeydivyansh1803 Changes LGTM. Please add some tests to cover the cases where the user is not able to set the new index/cluster settings on a non-remote-store cluster.
...rch/cluster/routing/allocation/decider/ShardsLimitAllocationDeciderRemoteStoreEnabledIT.java
…et for cluster which is not remote store enabled Signed-off-by: Divyansh Pandey <[email protected]>
❌ Gradle check result for 36d29c8: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-17295-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bc209ee6bacbb1027dcd7ba28d56b6ceb96f4fe0
# Push it to GitHub
git push --set-upstream origin backport/backport-17295-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x
Then, create a pull request where the
@pandeydivyansh1803 can you manually raise the backport PR?
Signed-off-by: Divyansh Pandey <[email protected]>
…hards per node and per index (opensearch-project#17295)
* Added a new index level setting to limit the total primary shards per index per node. Added relevant files for unit test and integration test. Signed-off-by: Divyansh Pandey <[email protected]>
* update files for code quality Signed-off-by: Divyansh Pandey <[email protected]>
* moved primary shard count function to RoutingNode.java Signed-off-by: Divyansh Pandey <[email protected]>
* removed unwanted files Signed-off-by: Divyansh Pandey <[email protected]>
* added cluster level setting to limit total primary shards per node Signed-off-by: Divyansh Pandey <[email protected]>
* allow the index level settings to be applied to both DOCUMENT and SEGMENT replication indices Signed-off-by: Divyansh Pandey <[email protected]>
* Added necessary validator to restrict the index and cluster level primary shards per node settings only for remote store enabled cluster. Added relevant unit and integration tests. Signed-off-by: Divyansh Pandey <[email protected]>
* refactoring changes Signed-off-by: Divyansh Pandey <[email protected]>
* refactoring changes Signed-off-by: Divyansh Pandey <[email protected]>
* Empty commit to rerun gradle test Signed-off-by: Divyansh Pandey <[email protected]>
* optimised the calculation of total primary shards on a node Signed-off-by: Divyansh Pandey <[email protected]>
* Refactoring changes Signed-off-by: Divyansh Pandey <[email protected]>
* refactoring changes, added TODO to MetadataCreateIndexService Signed-off-by: Divyansh Pandey <[email protected]>
* Added integration test for scenario where primary shards setting is set for cluster which is not remote store enabled Signed-off-by: Divyansh Pandey <[email protected]>
---------
Signed-off-by: Divyansh Pandey <[email protected]>
Signed-off-by: Divyansh Pandey <[email protected]>
Co-authored-by: Divyansh Pandey <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Description
For remote-store-backed clusters, segment replication is used as the replication strategy. With segment replication, segments are created only on the primary shard and are then copied to the replica shards. Because segment creation is CPU intensive, we have observed CPU skew between nodes of the same cluster when primary shards are not balanced.
The earlier attempts to rebalance primary shards across nodes (#6422, #12250) help reduce the skew, but they work on a best-effort basis and do not enforce any constraint.
Implement new settings in OpenSearch:
index.routing.allocation.total_primary_shards_per_node: An index-level setting to limit primary shards per node for a specific index. Store this limit (indexTotalPrimaryShardsPerNodeLimit) in index metadata, similar to indexTotalShardsPerNodeLimit.
cluster.routing.allocation.total_primary_shards_per_node: A cluster-level setting to limit the total primary shards on a node.
These settings will enhance control over primary shard distribution, improving cluster balance and performance management.
The existing ShardsLimitAllocationDecider class already contains the necessary infrastructure and logic to evaluate shard allocation constraints. It has access to the current cluster state, routing information, and methods to check shard counts per node. Given this existing functionality, we propose implementing the two new primary shard limit settings within this class. This approach leverages the current decision-making framework, ensuring consistency with existing allocation rules and minimizing code duplication. By extending the ShardsLimitAllocationDecider, we can efficiently integrate the new primary shard limit checks into the existing allocation decision process.
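As a rough illustration of the check described above, the following self-contained sketch models how a decider might evaluate the two primary shard limits before placing a shard on a node. It is not the code from this PR: the class PrimaryShardLimitSketch, the Shard type, and the method names are hypothetical, and treating a limit of zero or less as "no limit" is an assumption borrowed from how the existing total_shards_per_node settings behave.

```java
import java.util.List;

/** Simplified, hypothetical model of the primary shard limit check; not the OpenSearch implementation. */
public class PrimaryShardLimitSketch {

    /** A shard copy on a node: the index it belongs to and whether it is a primary. */
    public static final class Shard {
        final String index;
        final boolean primary;

        public Shard(String index, boolean primary) {
            this.index = index;
            this.primary = primary;
        }
    }

    /** Counts all primary shards currently on the node. */
    public static long primariesOnNode(List<Shard> onNode) {
        return onNode.stream().filter(s -> s.primary).count();
    }

    /** Counts the primary shards of one index currently on the node. */
    public static long primariesOnNode(List<Shard> onNode, String index) {
        return onNode.stream().filter(s -> s.primary && s.index.equals(index)).count();
    }

    /**
     * Returns true if the candidate shard may be placed on a node that already holds onNode.
     * Assumption: a limit of zero or less means the limit is disabled.
     */
    public static boolean canAllocate(Shard candidate, List<Shard> onNode,
                                      int indexPrimaryLimit, int clusterPrimaryLimit) {
        if (!candidate.primary) {
            return true; // the primary limits constrain primaries only
        }
        if (clusterPrimaryLimit > 0 && primariesOnNode(onNode) >= clusterPrimaryLimit) {
            return false; // node already holds the maximum number of primaries overall
        }
        if (indexPrimaryLimit > 0 && primariesOnNode(onNode, candidate.index) >= indexPrimaryLimit) {
            return false; // node already holds the maximum number of primaries for this index
        }
        return true;
    }

    public static void main(String[] args) {
        // A node holding two primaries (one per index) and one replica.
        List<Shard> node = List.of(
            new Shard("logs", true),
            new Shard("logs", false),
            new Shard("metrics", true)
        );
        // Index-level limit of 1 for "logs", no cluster-level limit.
        System.out.println(canAllocate(new Shard("logs", true), node, 1, -1));
        // Replicas are not affected by the primary limits.
        System.out.println(canAllocate(new Shard("logs", false), node, 1, -1));
        // Cluster-level limit of 2 primaries per node, applied to a new index.
        System.out.println(canAllocate(new Shard("traces", true), node, 1, 2));
    }
}
```

In this model the two limits are independent: a primary is rejected as soon as either the per-node total or the per-index count on that node would be exceeded, while replicas pass through untouched and remain subject only to the existing total shard limits.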
Related Issues
Resolves #17293
Check List
index.routing.allocation.total_primary_shards_per_node, index.routing.allocation.total_shards_per_node, cluster.routing.allocation.total_primary_shards_per_node and cluster.routing.allocation.total_shards_per_node documentation-website#9301)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.