Skip to content

Support modifying segmentInfos.counter in IndexWriter #14417

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

guojialiang92
Copy link
Contributor

@guojialiang92 guojialiang92 commented Mar 27, 2025

Description

This PR aims to address issue 14362. This issue includes a discussion of the benefits of this modification.

Tests

  • simple scenario
    • I added test TestIndexWriter#testAdvanceSegmentInfosCounter, the writer index a bunch of docs, then advance its counter.
  • crash and recovery scenario
    • I added test TestIndexWriter#testAdvanceSegmentCounterInCrashAndRecoveryScenario, the writer index a bunch of docs, then close it, start a new writer on the same index, and advance its counter.
  • Concurrent writing and modification scenario
    • Modified the test ThreadedIndexingAndSearchingTestCase, in the concurrent write thread, increased the logic of randomly modifying the segment counter, and checked after all write threads ended.

Checklist

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

Signed-off-by: guojialiang <[email protected]>
@guojialiang92
Copy link
Contributor Author

guojialiang92 commented Mar 27, 2025

Hi, @vigyasharma
Thanks for helping with the code review, I have made modifications according to the suggestions.

@vigyasharma
Copy link
Contributor

I think we can add a couple more tests to make it robust.

  1. Some tests around concurrency – index with multiple threads, then advance the counter in one of the threads, and validate behavior. You can look at ThreadedIndexingAndSearchingTestCase and its derived tests for motivation.
  2. A test for the crash-recovery scenario, which I suppose it the primary use case. We could make the writer index a bunch of docs, then kill it, start a new writer on the same index, and advance its counter.

@vigyasharma
Copy link
Contributor

Also, IIUC IndexWriter#advanceSegmentInfosVersion() was added to handle similar scenarios for NRT replication (Lucene's native segment replication implementation). I'm curious why we didn't run into the need to advance SegmentInfos#counter at that time. Do you remember, @mikemccand (I know it's been a while! (: )?

Signed-off-by: guojialiang <[email protected]>
Signed-off-by: guojialiang <[email protected]>
@guojialiang92
Copy link
Contributor Author

Thanks, @vigyasharma

I think we can add a couple more tests to make it robust.

  1. Some tests around concurrency – index with multiple threads, then advance the counter in one of the threads, and validate behavior. You can look at ThreadedIndexingAndSearchingTestCase and its derived tests for motivation.
  2. A test for the crash-recovery scenario, which I suppose it the primary use case. We could make the writer index a bunch of docs, then kill it, start a new writer on the same index, and advance its counter.

I have added the following tests according to the suggestions:

  1. For crash-recovery scenario, I have added TestIndexWriter#testAdvanceSegmentCounterInCrashAndRecoveryScenario.
  2. For scenarios with multi-threaded concurrent writes, I have modified the test ThreadedIndexingAndSearchingTestCase, in the concurrent write thread, increased the logic of randomly modifying the segment counter, and checked after all write threads ended.

@guojialiang92
Copy link
Contributor Author

Thanks, @vigyasharma
I also looked at Lucene's native segment replication, just sharing my personal opinion.

Also, IIUC IndexWriter#advanceSegmentInfosVersion() was added to handle similar scenarios for NRT replication (Lucene's native segment replication implementation). I'm curious why we didn't run into the need to advance SegmentInfos#counter at that time. Do you remember, @mikemccand (I know it's been a while! (: )?

In the code comments of Lucene's native segment replication, the risk of file conflicts is also mentioned, but no additional processing is done. From a robustness perspective, perhaps control should also be carried out. The relevant code is as follows:
ReplicaNode#fileIsIdentical (Segment name was reused! This is rare but possible and otherwise devastating)

  private boolean fileIsIdentical(String fileName, FileMetaData srcMetaData) throws IOException {

    FileMetaData destMetaData = readLocalFileMetaData(fileName);
    if (destMetaData == null) {
      // Something went wrong in reading the file (it's corrupt, truncated, does not exist, etc.):
      return false;
    }

    if (Arrays.equals(destMetaData.header(), srcMetaData.header()) == false
        || Arrays.equals(destMetaData.footer(), srcMetaData.footer()) == false) {
      // Segment name was reused!  This is rare but possible and otherwise devastating:
      if (isVerboseFiles()) {
        message("file " + fileName + ": will copy [header/footer is different]");
      }
      return false;
    } else {
      return true;
    }
  }

@guojialiang92
Copy link
Contributor Author

Hi, @vigyasharma
Could you please help me take a look again? Do you have any other suggestions?

Copy link
Contributor

@vigyasharma vigyasharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay, these changes look great, @guojialiang92. I love the new tests, and how you're testing multi-threaded scenarios.

I have some minor comments. Please also add a CHANGES.txt entry as well and I'll merge this.

@@ -189,6 +190,19 @@ public void run() {
addedField = null;
}

// Maybe advance segment counter
if (random().nextBoolean()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running this with 50% probability might slow down the tests too much with the synchronization this needs. Let's run it with a lower probability, something like 1 in 7 – if (random().nextInt(7) == 5) { ... }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have modified the test and CHANGES.txt.

@guojialiang92
Copy link
Contributor Author

Thanks, @vigyasharma
I have made the modifications as suggested, please reivew the code again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants