Skip to content

Fix performance regression in ApproximatePointRangeQuery with Lucene 10.2.1 change #18358

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
May 30, 2025
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Fix 'system call filter not installed' caused when network.host: 0.0.0.0 ([#18309](https://github.com/opensearch-project/OpenSearch/pull/18309))
- Fix MatrixStatsAggregator reuse when mode parameter changes ([#18242](https://github.com/opensearch-project/OpenSearch/issues/18242))
- Replace the deprecated construction method of TopScoreDocCollectorManager with the new method ([#18395](https://github.com/opensearch-project/OpenSearch/pull/18395))
- Fixed Approximate Framework regression with Lucene 10.2.1 by updating `intersectRight` BKD walk and `IntRef` visit method ([#18358](https://github.com/opensearch-project/OpenSearch/issues/18358

### Security

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -162,10 +162,6 @@
@Override
public void visit(int docID) {
// it is possible that size < 1024 and docCount < size but we will continue to count through all the 1024 docs
// and collect less, but it won't hurt performance
if (docCount[0] >= size) {
return;
}
adder.add(docID);
docCount[0]++;
}
Expand All @@ -177,9 +173,8 @@

@Override
public void visit(IntsRef ref) {
for (int i = 0; i < ref.length; i++) {
adder.add(ref.ints[ref.offset + i]);
}
adder.add(ref);
docCount[0] += ref.length;

Check warning on line 177 in server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java

View check run for this annotation

Codecov / codecov/patch

server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java#L176-L177

Added lines #L176 - L177 were not covered by tests
}

@Override
Expand Down Expand Up @@ -248,10 +243,10 @@
// custom intersect visitor to walk the left of the tree
public void intersectLeft(PointValues.IntersectVisitor visitor, PointValues.PointTree pointTree, long[] docCount)
throws IOException {
PointValues.Relation r = visitor.compare(pointTree.getMinPackedValue(), pointTree.getMaxPackedValue());
if (docCount[0] >= size) {
return;
}
PointValues.Relation r = visitor.compare(pointTree.getMinPackedValue(), pointTree.getMaxPackedValue());
switch (r) {
case CELL_OUTSIDE_QUERY:
// This cell is fully outside the query shape: stop recursing
Expand Down Expand Up @@ -293,65 +288,49 @@
}
}

// custom intersect visitor to walk the right of tree
// custom intersect visitor to walk the right of tree (from rightmost leaf going left)
public void intersectRight(PointValues.IntersectVisitor visitor, PointValues.PointTree pointTree, long[] docCount)
throws IOException {
PointValues.Relation r = visitor.compare(pointTree.getMinPackedValue(), pointTree.getMaxPackedValue());
if (docCount[0] >= size) {
return;
}
PointValues.Relation r = visitor.compare(pointTree.getMinPackedValue(), pointTree.getMaxPackedValue());
switch (r) {
case CELL_OUTSIDE_QUERY:
// This cell is fully outside the query shape: stop recursing
break;

case CELL_INSIDE_QUERY:
// If the cell is fully inside, we keep moving right as long as the point tree size is over our size requirement
if (pointTree.size() > size && docCount[0] < size && moveRight(pointTree)) {
intersectRight(visitor, pointTree, docCount);
pointTree.moveToParent();
}
// if point tree size is no longer over, we have to go back one level where it still was over and the intersect left
else if (pointTree.size() <= size && docCount[0] < size) {
pointTree.moveToParent();
intersectLeft(visitor, pointTree, docCount);
}
// if we've reached leaf, it means out size is under the size of the leaf, we can just collect all docIDs
else {
// Leaf node; scan and filter all points in this block:
if (docCount[0] < size) {
pointTree.visitDocIDs(visitor);
}
}
break;
case CELL_CROSSES_QUERY:
// If the cell is fully inside, we keep moving right as long as the point tree size is over our size requirement
if (pointTree.size() > size && docCount[0] < size && moveRight(pointTree)) {
intersectRight(visitor, pointTree, docCount);
pointTree.moveToParent();
}
// if point tree size is no longer over, we have to go back one level where it still was over and the intersect left
else if (pointTree.size() <= size && docCount[0] < size) {
if (pointTree.moveToChild() && docCount[0] < size) {
PointValues.PointTree leftChild = pointTree.clone();
// BKD is binary today, so one moveToSibling() is enough to land on the right child.
// If PointTree ever becomes n-ary, update the traversal below to visit all siblings or re-enable a full loop.
if (pointTree.moveToSibling()) {
// We have two children - visit right first
intersectRight(visitor, pointTree, docCount);
// Then visit left if we still need more docs
if (docCount[0] < size) {
intersectRight(visitor, leftChild, docCount);

Check warning on line 310 in server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java

View check run for this annotation

Codecov / codecov/patch

server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java#L310

Added line #L310 was not covered by tests
}
} else {
// Only one child - visit it
intersectRight(visitor, leftChild, docCount);

Check warning on line 314 in server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java

View check run for this annotation

Codecov / codecov/patch

server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java#L314

Added line #L314 was not covered by tests
}
pointTree.moveToParent();
intersectLeft(visitor, pointTree, docCount);
}
// if we've reached leaf, it means out size is under the size of the leaf, we can just collect all doc values
else {
// Leaf node; scan and filter all points in this block:
} else {
if (docCount[0] < size) {
pointTree.visitDocValues(visitor);
if (r == PointValues.Relation.CELL_INSIDE_QUERY) {
pointTree.visitDocIDs(visitor);
} else {
pointTree.visitDocValues(visitor);
}
}
}
break;
case CELL_OUTSIDE_QUERY:
break;

Check warning on line 328 in server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java

View check run for this annotation

Codecov / codecov/patch

server/src/main/java/org/opensearch/search/approximate/ApproximatePointRangeQuery.java#L328

Added line #L328 was not covered by tests
default:
throw new IllegalArgumentException("Unreachable code");
}
}

public boolean moveRight(PointValues.PointTree pointTree) throws IOException {
return pointTree.moveToChild() && pointTree.moveToSibling();
}

@Override
public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException {
LeafReader reader = context.reader();
Expand Down
Loading
Loading