Enable concurrent_segment_search auto mode by default #17978

Merged: 8 commits, Apr 24, 2025

@Vikasht34 (Contributor) commented Apr 16, 2025

Description

  1. Auto mode for concurrent segment search: with this change, the default mode for concurrent search changes from disabled to auto.
  2. Default slicing: the default slice count is Math.min(vCPU / 2, 4).
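
The slicing formula above can be sketched as follows. This is a minimal illustration only: the class and method names are hypothetical and are not OpenSearch APIs, and integer division is assumed, so a single vCPU evaluates to 0 under the formula exactly as written in the description.

```java
// Sketch only: mirrors the formula quoted in the PR description.
// SliceCountSketch and defaultSliceCount are hypothetical names, not OpenSearch APIs.
final class SliceCountSketch {

    // Default slice count as stated in the description: min(vCPU / 2, 4).
    // Integer division is assumed, so 1 vCPU yields 0 here.
    static int defaultSliceCount(int vcpus) {
        return Math.min(vcpus / 2, 4);
    }

    public static void main(String[] args) {
        // Print the slice count for a few illustrative machine sizes.
        for (int vcpus : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.println(vcpus + " vCPU -> " + defaultSliceCount(vcpus) + " slices");
        }
    }
}
```

Under these assumptions the count grows with the machine size up to 8 vCPUs and is capped at 4 beyond that.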

Related Issues

Resolves #17981

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for 8633e9f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@rishabh6788
Contributor

{"run-benchmark-test": "id_4"}

Contributor

The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/2882/ . Final results will be published once the job is completed.

Contributor

✅ Gradle check result for af1ca8e: SUCCESS

@rishabh6788
Contributor

{"run-benchmark-test": "id_6"}

Contributor

The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/2959/ . Final results will be published once the job is completed.

@opensearch-ci-bot
Collaborator

Benchmark Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/2959/

| Metric | Task | Value | Unit |
| --- | --- | --- | --- |
| Cumulative indexing time of primary shards | | 142.234 | min |
| Min cumulative indexing time across primary shards | | 142.234 | min |
| Median cumulative indexing time across primary shards | | 142.234 | min |
| Max cumulative indexing time across primary shards | | 142.234 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 101.934 | min |
| Cumulative merge count of primary shards | | 85 | |
| Min cumulative merge time across primary shards | | 101.934 | min |
| Median cumulative merge time across primary shards | | 101.934 | min |
| Max cumulative merge time across primary shards | | 101.934 | min |
| Cumulative merge throttle time of primary shards | | 15.3141 | min |
| Min cumulative merge throttle time across primary shards | | 15.3141 | min |
| Median cumulative merge throttle time across primary shards | | 15.3141 | min |
| Max cumulative merge throttle time across primary shards | | 15.3141 | min |
| Cumulative refresh time of primary shards | | 3.49752 | min |
| Cumulative refresh count of primary shards | | 111 | |
| Min cumulative refresh time across primary shards | | 3.49752 | min |
| Median cumulative refresh time across primary shards | | 3.49752 | min |
| Max cumulative refresh time across primary shards | | 3.49752 | min |
| Cumulative flush time of primary shards | | 12.8786 | min |
| Cumulative flush count of primary shards | | 95 | |
| Min cumulative flush time across primary shards | | 12.8786 | min |
| Median cumulative flush time across primary shards | | 12.8786 | min |
| Max cumulative flush time across primary shards | | 12.8786 | min |
| Total Young Gen GC time | | 3.676 | s |
| Total Young Gen GC count | | 110 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 35.9748 | GB |
| Translog size | | 5.12227e-08 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 15 | |
| Min Throughput | index-append | 15671.7 | docs/s |
| Mean Throughput | index-append | 16290.2 | docs/s |
| Median Throughput | index-append | 16177.2 | docs/s |
| Max Throughput | index-append | 17793.2 | docs/s |
| 50th percentile latency | index-append | 2030.2 | ms |
| 90th percentile latency | index-append | 3427.34 | ms |
| 99th percentile latency | index-append | 8606.41 | ms |
| 99.9th percentile latency | index-append | 11201.4 | ms |
| 100th percentile latency | index-append | 13358.5 | ms |
| 50th percentile service time | index-append | 2030.6 | ms |
| 90th percentile service time | index-append | 3427.11 | ms |
| 99th percentile service time | index-append | 8605.59 | ms |
| 99.9th percentile service time | index-append | 11201.4 | ms |
| 100th percentile service time | index-append | 13358.5 | ms |
| error rate | index-append | 0 | % |
| Min Throughput | wait-until-merges-finish | 0 | ops/s |
| Mean Throughput | wait-until-merges-finish | 0 | ops/s |
| Median Throughput | wait-until-merges-finish | 0 | ops/s |
| Max Throughput | wait-until-merges-finish | 0 | ops/s |
| 100th percentile latency | wait-until-merges-finish | 680433 | ms |
| 100th percentile service time | wait-until-merges-finish | 680433 | ms |
| error rate | wait-until-merges-finish | 0 | % |

Contributor

❌ Gradle check result for 932fa0f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bowenlan-amzn
Member

@github-actions commented on Apr 23, 2025, 2:49 PM PDT:

❌ Gradle check result for 932fa0f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Originally posted by @github-actions[bot] in #17978 (comment)

Flaky test #16576

Contributor

✅ Gradle check result for 932fa0f: SUCCESS

Contributor

❌ Gradle check result for 1925434: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for 1925434: SUCCESS

@sohami merged commit e7ed33f into opensearch-project:main on Apr 24, 2025
29 checks passed
@opensearch-trigger-bot
Contributor

The backport to 3.0 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-3.0 3.0
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-3.0
# Create a new branch
git switch --create backport/backport-17978-to-3.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e7ed33f60d275c3482ee7b82007ff8bdb00dbe90
# Push it to GitHub
git push --set-upstream origin backport/backport-17978-to-3.0
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-3.0

Then, create a pull request where the base branch is 3.0 and the compare/head branch is backport/backport-17978-to-3.0.

bowenlan-amzn added a commit to bowenlan-amzn/OpenSearch that referenced this pull request Apr 24, 2025
…ject#17978)

* Enable concurrent_segment_search auto mode by default

Signed-off-by: Vikasht34 <[email protected]>

* Make Default Slice count to 1 for Non-Concurrent Path

Signed-off-by: Vikasht34 <[email protected]>

* Add tolerance to matrix_stats agg correlation value assertion

The correlation value can differ depending on how documents are distributed across shards or slices. For example, slice1(doc1,doc2), slice2(doc3,doc4,doc5) can give a different correlation from slice1(doc1,doc2,doc3), slice2(doc4,doc5).

The tolerance used here is 0.000000000000001

Signed-off-by: bowenlan-amzn <[email protected]>

---------

Signed-off-by: Vikasht34 <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
Co-authored-by: bowenlan-amzn <[email protected]>
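
The tolerance-based assertion described in the commit message above can be illustrated with a small sketch. This is not the actual matrix_stats test: the class and method names are hypothetical, and the example assumes (the commit does not say so explicitly) that the differences come from floating-point rounding when partial results are combined in a different grouping.

```java
// Sketch only: shows why an exact equality check can fail when the same
// values are combined in a different grouping, and how a tolerance helps.
// ToleranceSketch and closeEnough are hypothetical names, not OpenSearch test helpers.
final class ToleranceSketch {

    // Tolerance value quoted in the commit message.
    static final double TOLERANCE = 0.000000000000001;

    // True when the two values differ by no more than the tolerance.
    static boolean closeEnough(double expected, double actual) {
        return Math.abs(expected - actual) <= TOLERANCE;
    }

    public static void main(String[] args) {
        // The same three numbers, grouped two different ways.
        double groupedLeft = (0.1 + 0.2) + 0.3;
        double groupedRight = 0.1 + (0.2 + 0.3);

        System.out.println(groupedLeft == groupedRight);            // prints false
        System.out.println(closeEnough(groupedLeft, groupedRight)); // prints true
    }
}
```

An absolute tolerance like this suits values of bounded magnitude such as a correlation coefficient; values of widely varying scale would usually call for a relative tolerance instead.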
jainankitk pushed a commit that referenced this pull request Apr 24, 2025
* Enable concurrent_segment_search auto mode by default



* Make Default Slice count to 1 for Non-Concurrent Path



* Add tolerance to matrix_stats agg correlation value assertion

The correlation value can differ depending on how documents are distributed across shards or slices. For example, slice1(doc1,doc2), slice2(doc3,doc4,doc5) can give a different correlation from slice1(doc1,doc2,doc3), slice2(doc4,doc5).

The tolerance used here is 0.000000000000001



---------

Signed-off-by: Vikasht34 <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
Co-authored-by: Vikasht34 <[email protected]>
Labels
backport 3.0, backport-failed, enhancement (Enhancement or improvement to existing feature or request), Search:Performance
Development

Successfully merging this pull request may close these issues.

[Feature Request] Enable concurrent segment search by default in auto mode
6 participants