Optimize ParallelLeafReader to improve term vector fetching efficiency #14373

DivyanshIITB · 2025-03-19T14:30:37Z

This PR optimizes ParallelLeafReader to avoid redundant term vector fetching.

- Replaces per-field term vector fetching with a single call per reader.
- Reduces complexity from O(n^2) to O(n) in the number of fields.
- Improves performance when handling large numbers of fields.
- Verified via existing tests.
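
To illustrate the idea, here is a minimal, self-contained sketch of the two access patterns. It is not the code in this PR and does not use Lucene classes: `CountingReader`, `fetchAll`, `perField`, `perReader`, and `demo` are hypothetical names, and the cost model (every fetch loads the vectors of all fields in the document) is an assumption made for the illustration.

```java
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class TermVectorFetchSketch {

  /** Hypothetical stand-in for one sub-reader; not a Lucene class. */
  static final class CountingReader {
    private final Map<String, String> vectorsByField;
    long fieldsLoaded = 0; // total work done across all fetchAll calls

    CountingReader(Map<String, String> vectorsByField) {
      this.vectorsByField = vectorsByField;
    }

    /** Assumed cost model: one call loads every field's vector for the document. */
    Map<String, String> fetchAll(int docId) {
      fieldsLoaded += vectorsByField.size();
      return vectorsByField;
    }
  }

  /** Per-field pattern: a full fetch for every field, even when fields share a reader. */
  static Map<String, String> perField(Map<String, CountingReader> fieldToReader, int docId) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, CountingReader> e : fieldToReader.entrySet()) {
      String vector = e.getValue().fetchAll(docId).get(e.getKey());
      if (vector != null) {
        result.put(e.getKey(), vector);
      }
    }
    return result;
  }

  /** Per-reader pattern: fetch once per distinct reader, then reuse the cached result. */
  static Map<String, String> perReader(Map<String, CountingReader> fieldToReader, int docId) {
    Map<CountingReader, Map<String, String>> cache = new IdentityHashMap<>();
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, CountingReader> e : fieldToReader.entrySet()) {
      Map<String, String> all = cache.computeIfAbsent(e.getValue(), r -> r.fetchAll(docId));
      String vector = all.get(e.getKey());
      if (vector != null) {
        result.put(e.getKey(), vector);
      }
    }
    return result;
  }

  /** Returns {work with perField, work with perReader} for a single reader holding n fields. */
  static long[] demo(int n) {
    long[] work = new long[2];
    for (int variant = 0; variant < 2; variant++) {
      Map<String, String> vectors = new LinkedHashMap<>();
      for (int i = 0; i < n; i++) {
        vectors.put("field" + i, "vector" + i);
      }
      CountingReader reader = new CountingReader(vectors);
      Map<String, CountingReader> fieldToReader = new LinkedHashMap<>();
      for (String field : vectors.keySet()) {
        fieldToReader.put(field, reader);
      }
      Map<String, String> result =
          variant == 0 ? perField(fieldToReader, 0) : perReader(fieldToReader, 0);
      if (result.size() != n) {
        throw new IllegalStateException("both patterns should return every field");
      }
      work[variant] = reader.fieldsLoaded;
    }
    return work;
  }

  public static void main(String[] args) {
    long[] work = demo(4);
    System.out.println("per-field: " + work[0] + ", per-reader: " + work[1]);
  }
}
```

Under these assumptions, four fields on one reader cost 16 field loads with the per-field pattern and 4 with the per-reader pattern, which is the quadratic-versus-linear difference described above. The identity-keyed cache is one possible design choice: it keys on reader instances without relying on their `equals` or `hashCode`.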

DivyanshIITB · 2025-03-20T16:59:05Z

Just a gentle reminder
@vigyasharma

vigyasharma

Changes look good to me. Can you run ./gradlew tidy to fix formatting issues, and add a changes entry before we merge this?

lucene/core/src/java/org/apache/lucene/index/ParallelLeafReader.java

DivyanshIITB · 2025-03-22T09:42:30Z

> Changes look good to me. Can you run ./gradlew tidy to fix formatting issues, and add a changes entry before we merge this?

I successfully ran ./gradlew tidy and the build was successful.
Also added the changes entry in CHANGES.txt.

vigyasharma · 2025-03-22T23:30:22Z

lucene/CHANGES.txt

+- Fetches all term vectors once per reader instead of per field.
+- Reduces complexity from **O(n²) to O(n)**.
+- Enhances performance for documents with many fields. (Divyansh Agrawal)


We generally keep a single bullet per changes entry. Details are already available in the pull request that the entry points to.

I have modified CHANGES.txt as you said.

vigyasharma · 2025-03-22T23:32:25Z

> I successfully ran ./gradlew tidy and the build was successful.

The GitHub build is still failing on spotless (formatting). tidy will change and reformat offending files for you; you need to commit and push those changes.

DivyanshIITB · 2025-03-23T14:28:02Z

> > I successfully ran ./gradlew tidy and the build was successful.
>
> The GitHub build is still failing on spotless (formatting). tidy will change and reformat offending files for you; you need to commit and push those changes.

Thanks for the review, @vigyasharma!
I have run ./gradlew tidy and pushed the formatting fixes. Let me know if there's anything else needed.

vigyasharma · 2025-03-24T06:52:05Z

Changes merged. Thanks @DivyanshIITB !

Optimize ParallelLeafReader to improve term vector fetching efficiency

0845317

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Mar 19, 2025

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Mar 19, 2025

github-actions bot added the module:core/index label Mar 19, 2025

vigyasharma reviewed Mar 22, 2025

lucene/core/src/java/org/apache/lucene/index/ParallelLeafReader.java (outdated)

Update CHANGES.txt: Optimize ParallelLeafReader term vector fetching (apache#14373)

0f377f4

vigyasharma reviewed Mar 22, 2025

Fixed formatting using ./gradlew tidy

3e3516b

vigyasharma approved these changes Mar 24, 2025

vigyasharma merged commit 9272d4d into apache:main Mar 24, 2025
7 checks passed

github-project-automation bot moved this from Open to Merged in OpenSearch Lucene & Core Performance Tracking Mar 24, 2025

vigyasharma added a commit that referenced this pull request Mar 24, 2025

fix changes entry for #14373

37ab3c3

vigyasharma pushed a commit that referenced this pull request Mar 24, 2025

Optimize ParallelLeafReader to improve term vector fetching efficiency (#14373)

555efca

vigyasharma added a commit that referenced this pull request Mar 24, 2025

fix changes entry for #14373

5e098c1

jpountz pushed a commit to jpountz/lucene that referenced this pull request Mar 24, 2025

Optimize ParallelLeafReader to improve term vector fetching efficiency (apache#14373)

c6b1f3c

jpountz pushed a commit to jpountz/lucene that referenced this pull request Mar 24, 2025

fix changes entry for apache#14373

9c338e4

jpountz pushed a commit to jpountz/lucene that referenced this pull request Mar 24, 2025

Optimize ParallelLeafReader to improve term vector fetching efficiency (apache#14373)

7262376

jpountz pushed a commit to jpountz/lucene that referenced this pull request Mar 24, 2025

fix changes entry for apache#14373

f77a0cb
