-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Stop closing index input when loading NRTSuggester #14271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Cloning the index input is not necessary when loading the NRTSUggester. Also, it causes a bug as the clone gets closed right before the suggester instance gets returned, which makes it unusable in the later lookup method which slices the index input while it's already closed. This was not discovered so far due to a testing gap that is fixed by apache#14270
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good.
dictClone.seek(offset); | ||
suggester = NRTSuggester.load(dictClone, fstLoadMode); | ||
} | ||
IndexInput indexInput = dictIn.slice("NRTSuggester", offset, dictIn.length() - offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This got me confused wrt the PR title since slices are technically clones. Maybe update the title to say "stop closing the clone" instead of "stop cloning"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, we no longer call clone, that is what I meant :) but I see your point that we call it indirectly now.
Thanks all! |
Cloning the index input is not necessary when loading the NRTSUggester. Also, it causes a bug as the clone gets closed right before the suggester instance gets returned, which makes it unusable in the later lookup method which slices the index input while it's already closed. This was not discovered so far due to a testing gap that is fixed by #14270
Cloning the index input is not necessary when loading the NRTSUggester. Also, it causes a bug as the clone gets closed right before the suggester instance gets returned, which makes it unusable in the later lookup method which slices the index input while it's already closed. This was not discovered so far due to a testing gap that is fixed by #14270
All the existing completion postings format load their FSTs on-heap. It is possible to customize that behaviour by mainintaing a custom postings format that override the fst load mode. TestSuggestField attempted to test all modes, but in practice, it can only test the default load mode because the read path relies on the default constructor called via SPI, that does not allow providing the fst load mode. This commit changes the default value to OFF_HEAP, and addresses the testing gap in TestSuggestField, so that all load modes are tested. This unveiled a bug when loading FST off heap and the index input is a MemorySegmentIndexInput which is fixed by apache#14271.
Closing the index input is not necessary when loading the NRTSuggester. Also, it causes a bug as the clone gets closed right before the suggester instance gets returned, which makes it unusable in the later lookup method which slices the index input while it's already closed. This is addressed by creating a slice, that is not further closed, so that the index input is still usable once the suggester is returned.
This was not discovered so far due to a testing gap that is fixed by #14270.
The bug was introduced by #13524 , stacktrace follows: