Commit 6ffe67d

tomaarsen, pcuenca, and osanseviero authored
Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval (#1934)
* Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

* Mark Aamir as guest

Co-authored-by: Pedro Cuenca <[email protected]>

* Update memory/disk usage with 41M embeddings

* #Demo -> #demo

* Update Aamir's username

* Apply suggestions from review

Thanks a ton, Pedro!

Co-authored-by: Pedro Cuenca <[email protected]>

* Replace most footnotes with links

* Remove all comments

* add comment about 92.5% without rescoring in binary quant.

* Separate the code blocks for binary quantization

* Apply suggestions from review

Co-authored-by: Pedro Cuenca <[email protected]>

* Fix incomplete links

* Add clarification about the difference between scalar and binary

* Apply suggestions from review

Thanks Omar!

Co-authored-by: Omar Sanseviero <[email protected]>

* Mention that embedding quantization is not like model quantization

---------

Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>
1 parent 416435f commit 6ffe67d

3 files changed, +399 -1 lines changed


_blog.yml

+14 -1
@@ -3707,4 +3707,17 @@
   tags:
     - leaderboard
     - arena
-    - collaboration
+    - collaboration
+
+- local: embedding-quantization
+  title: "Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval"
+  author: aamirshakir
+  guest: true
+  thumbnail: /blog/assets/embedding-quantization/thumbnail.png
+  date: Mar 22, 2024
+  tags:
+    - nlp
+    - community
+    - guide
+    - collaboration
+    - research
