Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval (#1934)
* Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval
* Mark Aamir as guest
Co-authored-by: Pedro Cuenca <[email protected]>
* Update memory/disk usage with 41M embeddings
* #Demo -> #demo
* Update Aamir's username
* Apply suggestions from review
Thanks a ton, Pedro!
Co-authored-by: Pedro Cuenca <[email protected]>
* Replace most footnotes with links
* Remove all comments
* Add comment about 92.5% without rescoring in binary quantization
* Separate the code blocks for binary quantization
* Apply suggestions from review
Co-authored-by: Pedro Cuenca <[email protected]>
* Fix incomplete links
* Add clarification about the difference between scalar and binary
* Apply suggestions from review
Thanks Omar!
Co-authored-by: Omar Sanseviero <[email protected]>
* Mention that embedding quantization is not like model quantization
---------
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>