Fixing quantization interval initialization for optimized sq #14374
A rather silly bug that we didn't catch because we were testing on well-behaved modern vectors. However, many benchmarking datasets and some more "bespoke" feature models (MNIST, etc.) do not have well-distributed components, and that is where the bug showed up.
Previously, the recall for MNIST was ~0.018. YIKES.
Here are some numbers with this bug fix (I include "well-behaved" component vectors here to show there isn't a negative impact there).
Latency and similar metrics are always tricky to benchmark. These were run on my laptop while I was actively working on other things, so I would pay most attention to the recall.
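For readers unfamiliar with the metric, here is a minimal sketch of how recall@k is commonly computed when comparing approximate search results against exact nearest neighbors. It is not the benchmark harness used for these numbers; the function name `recall_at_k` and its inputs are hypothetical and chosen only for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Average fraction of the exact top-k neighbors that the approximate search returned.

    approx_ids and exact_ids are lists with one entry per query; each entry is a
    ranked list of document ids. Both names are illustrative, not from the PR.
    """
    if not exact_ids:
        raise ValueError("need at least one query")
    if len(approx_ids) != len(exact_ids):
        raise ValueError("approx_ids and exact_ids must cover the same queries")
    if k <= 0:
        raise ValueError("k must be positive")

    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        truth = set(exact[:k])             # exact (brute-force) top-k for this query
        found = truth.intersection(approx[:k])  # overlap with the approximate top-k
        total += len(found) / k
    return total / len(exact_ids)
```

A recall near 0.018 under this definition would mean the quantized search returned almost none of the true neighbors, which is why it is a more telling signal here than laptop latency.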
Fashion-MNIST (784 dims):
Cohere v2 (768 dims):
Cohere v3 (1024 dims):
E5-small-v2 (384 dims):
related: #14342
(I am not closing the issue with this PR. I think there are further improvements to be gained by preserving dot-product behavior on these variously distributed vector components.)