Skip to content

Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. #14701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 25, 2025

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented May 22, 2025

MaxScoreBulkScorer and BlockMaxConjunctionBulkScorer currently evaluate hits in a doc-at-a-time (DAAT) fashion, meaning that they they look at all their clauses to find the next doc and so forth until all docs from the window are evaluated. This changes evaluation to run in a more term-at-a-time fashion (TAAT) within scoring windows, meaning that each clause is fully evaluated within the window before moving on to the next clause.

Note that this isn't completely new, BooleanScorer has been doing this to exhaustively evaluate disjunctive queries, by loading their matches into a bit set, one clause at a time. Also note that this is a bit different from traditional TAAT as this is scoped to small-ish windows of doc IDs, not the entire doc ID space.

This in-turn allows these scorers to take advantage of the new Scorer#nextDocsAndScores API, and provides a good speedup. A downside is that we may need to perform more memory copying in some cases, and evaluate a bit more documents, but the change still looks like a win in general.

…-time fashion.

`MaxScoreBulkScorer` and `BlockMaxConjunctionBulkScorer` currently evaluate
hits in a doc-at-a-time (DAAT) fashion, meaning that they they look at all
their clauses to find the next doc and so forth until all docs from the window
are evaluated. This changes evaluation to run in a more term-at-a-time fashion
(TAAT) within scoring windows, meaning that each clause is fully evaluated
within the window before moving on to the next clause.

Note that this isn't completely new, `BooleanScorer` has been doing this to
exhaustively evaluate disjunctive queries, by loading their matches into a bit
set, one clause at a time. Also note that this is a bit different from
traditional TAAT as this is scoped to small-ish windows of doc IDs, not the
entire doc ID space.

This in-turn allows these scorers to take advantage of the new
`Scorer#nextDocsAndScores` API, and provides a good speedup. A downside is that
we may need to perform more memory copying in some cases, and evaluate a bit
more documents, but the change still looks like a win in general.
Copy link

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

@jpountz
Copy link
Contributor Author

jpountz commented May 22, 2025

Filtered queries get a slowdown, but some important queries get a big speedup:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
             FilteredAndHighHigh       65.66      (1.9%)       59.10      (4.6%)  -10.0% ( -16% -   -3%) 0.000
            FilteredAndStopWords       45.10      (2.4%)       41.04      (3.9%)   -9.0% ( -14% -   -2%) 0.000
               FilteredAnd3Terms      181.24      (1.7%)      165.29      (3.2%)   -8.8% ( -13% -   -3%) 0.000
     FilteredAnd2Terms2StopWords      178.47      (1.3%)      162.78      (2.9%)   -8.8% ( -12% -   -4%) 0.000
                          Fuzzy1      102.41      (2.9%)       98.67      (3.2%)   -3.6% (  -9% -    2%) 0.000
             FilteredOrStopWords       46.85      (1.6%)       45.27      (2.5%)   -3.4% (  -7% -    0%) 0.000
              FilteredOrHighHigh       68.45      (0.9%)       66.49      (1.9%)   -2.9% (  -5% -    0%) 0.000
                          Fuzzy2       85.94      (2.6%)       83.58      (2.8%)   -2.7% (  -7% -    2%) 0.002
                      OrHighRare      291.53      (3.3%)      283.82      (6.1%)   -2.6% ( -11% -    6%) 0.096
              FilteredAndHighMed      126.38      (2.6%)      123.62      (5.5%)   -2.2% ( -10% -    6%) 0.119
                      DismaxTerm      486.16      (2.2%)      478.90      (3.2%)   -1.5% (  -6% -    3%) 0.089
               FilteredOrHighMed      153.91      (0.8%)      151.71      (1.3%)   -1.4% (  -3% -    0%) 0.000
      FilteredOr2Terms2StopWords      148.64      (1.1%)      146.57      (1.5%)   -1.4% (  -3% -    1%) 0.001
             CombinedAndHighHigh       11.47      (2.6%)       11.33      (2.3%)   -1.2% (  -5% -    3%) 0.132
                FilteredOr3Terms      167.24      (1.1%)      165.61      (1.6%)   -1.0% (  -3% -    1%) 0.030
                        Wildcard       94.57      (2.8%)       93.67      (2.7%)   -1.0% (  -6% -    4%) 0.281
              CombinedOrHighHigh       18.94      (1.8%)       18.80      (2.1%)   -0.7% (  -4% -    3%) 0.243
                            Term      439.45      (3.6%)      436.34      (4.5%)   -0.7% (  -8% -    7%) 0.595
                      TermDTSort      389.67      (4.7%)      386.92      (4.8%)   -0.7% (  -9% -    9%) 0.647
                    FilteredTerm      159.78      (1.6%)      158.69      (2.3%)   -0.7% (  -4% -    3%) 0.282
                 AndHighOrMedMed       46.65      (1.1%)       46.40      (1.8%)   -0.5% (  -3% -    2%) 0.267
                        PKLookup      320.16      (4.3%)      318.93      (4.7%)   -0.4% (  -8% -    8%) 0.791
                 FilteredPrefix3      154.48      (2.3%)      154.23      (2.6%)   -0.2% (  -4% -    4%) 0.840
                          Phrase       14.34      (2.2%)       14.32      (2.3%)   -0.1% (  -4% -    4%) 0.858
                          IntNRQ      309.51      (0.8%)      309.13      (0.9%)   -0.1% (  -1% -    1%) 0.641
                         Prefix3      163.72      (2.3%)      163.54      (2.7%)   -0.1% (  -4% -    4%) 0.893
                 CountAndHighMed      312.21      (1.2%)      312.08      (1.6%)   -0.0% (  -2% -    2%) 0.929
                IntervalsOrdered        2.29      (2.9%)        2.29      (2.5%)    0.1% (  -5% -    5%) 0.919
                  FilteredIntNRQ      301.55      (1.0%)      301.84      (0.9%)    0.1% (  -1% -    2%) 0.760
             CountFilteredOrMany       27.46      (1.0%)       27.50      (1.3%)    0.1% (  -2% -    2%) 0.754
          CountFilteredOrHighMed      148.19      (0.7%)      148.39      (0.6%)    0.1% (  -1% -    1%) 0.509
                     CountOrMany       30.46      (1.5%)       30.52      (1.5%)    0.2% (  -2% -    3%) 0.712
                       CountTerm     8687.88      (2.8%)     8708.11      (4.5%)    0.2% (  -6% -    7%) 0.850
         CountFilteredOrHighHigh      136.50      (0.7%)      136.90      (0.6%)    0.3% (  -1% -    1%) 0.178
                CountAndHighHigh      359.19      (1.7%)      360.43      (1.6%)    0.3% (  -2% -    3%) 0.520
                  CountOrHighMed      366.72      (1.5%)      368.05      (1.8%)    0.4% (  -2% -    3%) 0.486
                    CombinedTerm       30.67      (3.5%)       30.78      (4.0%)    0.4% (  -6% -    8%) 0.751
                 CountOrHighHigh      346.98      (1.1%)      348.48      (1.5%)    0.4% (  -2% -    3%) 0.304
                  FilteredPhrase       32.93      (1.8%)       33.07      (1.5%)    0.4% (  -2% -    3%) 0.413
             CountFilteredPhrase       25.29      (3.2%)       25.50      (2.5%)    0.8% (  -4% -    6%) 0.377
                   TermTitleSort       85.82     (10.1%)       86.55      (7.5%)    0.9% ( -15% -   20%) 0.767
                   TermMonthSort     3181.64      (2.9%)     3213.86      (2.2%)    1.0% (  -3% -    6%) 0.227
                  FilteredOrMany       16.50      (2.7%)       16.67      (2.5%)    1.0% (  -4% -    6%) 0.221
                     CountPhrase        4.15      (3.2%)        4.20      (2.0%)    1.3% (  -3% -    6%) 0.137
               TermDayOfYearSort      283.82      (3.8%)      287.84      (1.8%)    1.4% (  -4% -    7%) 0.146
               CombinedOrHighMed       72.56      (2.4%)       73.69      (1.8%)    1.6% (  -2% -    5%) 0.024
                          OrMany       20.45      (5.3%)       21.05      (6.0%)    2.9% (  -8% -   15%) 0.114
              CombinedAndHighMed       38.55      (2.3%)       39.92      (2.2%)    3.6% (   0% -    8%) 0.000
                DismaxOrHighHigh      118.14      (2.1%)      122.80      (4.3%)    3.9% (  -2% -   10%) 0.000
                 DismaxOrHighMed      173.28      (2.3%)      183.27      (2.2%)    5.8% (   1% -   10%) 0.000
                        Or3Terms      165.74      (5.5%)      197.36      (7.9%)   19.1% (   5% -   34%) 0.000
                       And3Terms      172.23      (4.2%)      205.61      (7.9%)   19.4% (   7% -   32%) 0.000
              Or2Terms2StopWords      159.77      (5.6%)      191.42      (6.9%)   19.8% (   6% -   34%) 0.000
             And2Terms2StopWords      161.65      (4.1%)      194.38      (7.1%)   20.2% (   8% -   32%) 0.000
                AndMedOrHighHigh       65.43      (2.8%)       79.68      (2.7%)   21.8% (  15% -   28%) 0.000
                    AndStopWords       29.75      (6.7%)       36.45     (12.7%)   22.5% (   2% -   44%) 0.000
                     OrStopWords       32.42      (9.6%)       39.95     (13.0%)   23.2% (   0% -   50%) 0.000
                       OrHighMed      192.11      (4.0%)      244.47      (2.7%)   27.3% (  19% -   35%) 0.000
                      OrHighHigh       51.75      (4.9%)       66.11      (9.2%)   27.8% (  13% -   44%) 0.000
                     AndHighHigh       43.14      (1.5%)       56.30     (11.8%)   30.5% (  16% -   44%) 0.000
                      AndHighMed      135.13      (0.9%)      183.67      (4.2%)   35.9% (  30% -   41%) 0.000

Copy link
Contributor

@gf2121 gf2121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, simpler and faster!

@@ -55,9 +55,6 @@ final class BlockMaxConjunctionBulkScorer extends BulkScorer {
this.iterators =
Arrays.stream(this.scorers).map(Scorer::iterator).toArray(DocIdSetIterator[]::new);
lead1 = ScorerUtil.likelyImpactsEnum(iterators[0]);
lead2 = ScorerUtil.likelyImpactsEnum(iterators[1]);
scorer1 = this.scorables[0];
scorer2 = this.scorables[1];
this.sumOfOtherClauses = new double[this.scorers.length];
for (int i = 0; i < sumOfOtherClauses.length; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not introduced in this PR, but maybe change this to a Arrays.fill by the way :)

@@ -38,11 +37,12 @@ final class BlockMaxConjunctionBulkScorer extends BulkScorer {
private final Scorer[] scorers;
private final Scorable[] scorables;
private final DocIdSetIterator[] iterators;
private final DocIdSetIterator lead1, lead2;
private final Scorable scorer1, scorer2;
private final DocIdSetIterator lead1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just name it lead since we do not have lead2 now.


doc = lead1.nextDoc();
int maxOtherDoc = -1;
for (int i = 0; i < iterators.length; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not important but since we are naming maxOtherDoc let's start i from 1?

if (scorer.doc == doc) {
score += scorer.scorable.score();
}
ScorerUtil.filterCompetitiveHits(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we skip this if minCompetitiveScore == 0f, like other place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed as this evaluates non-essential clauses, and there can only be non-essential clauses if minCompetitiveScore > 0. I'll add an assertion to make this clearer.

if (curDoc < targetDoc) {
curDoc = iterator.advance(targetDoc);
}
if (curDoc == targetDoc) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, could VectorUtil#findNextGTE help here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered the same. :) Let's keep it this way (slightly simpler) and look into whether this can be made faster with VectorUtil in a follow-up?

@@ -118,98 +115,36 @@ private void scoreWindow(
return;
}

Scorable scorer1 = this.scorer1;
if (scorers[0].getMaxScore(max - 1) == 0f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why the FilteredAnd tasks, which use BlockMaxConjunctionBulkScorer, are getting more slow down than other tasks. Could the removal of this special optimization be the reason? Do we intend to remove this for simplification?

Copy link
Contributor Author

@jpountz jpountz May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a contributing factor indeed. I suspect that another factor is that filtered tasks use the default impl of nextDocsAndScores which adds some overhead compared to what we were doing before since it first needs to copy matching docs to an array before doing work on this array. I was keen to looking into the performance of filtered tasks in a follow-up, I worry it's not going to be easy to recover performance.

Copy link
Contributor

@gf2121 gf2121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@jpountz jpountz merged commit b5e79a3 into apache:main May 25, 2025
7 checks passed
@jpountz jpountz deleted the clause_at_a_time branch May 25, 2025 19:18
jpountz added a commit that referenced this pull request May 25, 2025
…-time fashion. (#14701)

`MaxScoreBulkScorer` and `BlockMaxConjunctionBulkScorer` currently evaluate
hits in a doc-at-a-time (DAAT) fashion, meaning that they they look at all
their clauses to find the next doc and so forth until all docs from the window
are evaluated. This changes evaluation to run in a more term-at-a-time fashion
(TAAT) within scoring windows, meaning that each clause is fully evaluated
within the window before moving on to the next clause.

Note that this isn't completely new, `BooleanScorer` has been doing this to
exhaustively evaluate disjunctive queries, by loading their matches into a bit
set, one clause at a time. Also note that this is a bit different from
traditional TAAT as this is scoped to small-ish windows of doc IDs, not the
entire doc ID space.

This in-turn allows these scorers to take advantage of the new
`Scorer#nextDocsAndScores` API, and provides a good speedup. A downside is that
we may need to perform more memory copying in some cases, and evaluate a bit
more documents, but the change still looks like a win in general.
weizijun added a commit to weizijun/lucene that referenced this pull request May 27, 2025
* main: (32 commits)
  update os.makedirs with pathlib mkdir (apache#14710)
  Optimize AbstractKnnVectorQuery#createBitSet with intoBitset (apache#14674)
  Implement #docIDRunEnd() on PostingsEnum. (apache#14693)
  Speed up TermQuery (apache#14709)
  Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. (apache#14701)
  Fix WindowsFS test failure seen on Policeman Jenkins (apache#14706)
  Use a temporary repository location to download certain ecj versions ("drops") (apache#14703)
  Add assumption to ignore occasional test failures due to disconnected graphs (apache#14696)
  Return MatchNoDocsQuery when IndexOrDocValuesQuery::rewrite does not match (apache#14700)
  Minor access modifier adjustment to a couple of lucene90 backward compat types (apache#14695)
  Speed up exhaustive evaluation. (apache#14679)
  Specify and test that IOContext is immutable (apache#14686)
  deps(java): bump org.gradle.toolchains.foojay-resolver-convention (apache#14691)
  deps(java): bump org.eclipse.jgit:org.eclipse.jgit (apache#14692)
  Clean up how the test framework creates asserting scorables. (apache#14452)
  Make competitive iterators more robust. (apache#14532)
  Remove DISIDocIdStream. (apache#14550)
  Implement AssertingPostingsEnum#intoBitSet. (apache#14675)
  Fix patience knn queries to work with seeded knn queries (apache#14688)
  Added toString() method to BytesRefBuilder (apache#14676)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants