
Commit 40d103b

Replace broken footnote with parentheses (#1935)
1 parent 6ffe67d commit 40d103b

File tree

1 file changed, +1 -3 lines changed

embedding-quantization.md (+1 -3)
@@ -65,7 +65,7 @@ The table below gives an overview of different models, dimension size, memory re

 There are several ways to approach the challenges of scaling embeddings. The most common approach is dimensionality reduction, such as [PCA](https://en.wikipedia.org/wiki/Principal_component_analysis). However, classic dimensionality reduction -- like PCA methods -- [tends to perform poorly when used with embeddings](https://arxiv.org/abs/2205.11498).

-In recent news, [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147) ([blogpost](https://huggingface.co/blog/matryoshka)) (MRL) as used by [OpenAI](https://openai.com/blog/new-embedding-models-and-api-updates) also allows for cheaper embeddings. With MRL, only the first `n` embedding dimensions are used. This approach has already been adopted by some open models like [nomic-ai/nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) and [mixedbread-ai/mxbai-embed-2d-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-2d-large-v1) [^1].
+In recent news, [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147) ([blogpost](https://huggingface.co/blog/matryoshka)) (MRL) as used by [OpenAI](https://openai.com/blog/new-embedding-models-and-api-updates) also allows for cheaper embeddings. With MRL, only the first `n` embedding dimensions are used. This approach has already been adopted by some open models like [nomic-ai/nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) and [mixedbread-ai/mxbai-embed-2d-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-2d-large-v1) (For OpenAIs `text-embedding-3-large`, we see a performance retention of 93.1% at 12x compression. For nomic's model, we retain 95.8% of performance at 3x compression and 90% at 6x compression.).

 However, there is another new approach to achieve progress on this challenge; it does not entail dimensionality reduction, but rather a reduction in the size of each of the individual values in the embedding: **Quantization**. Our experiments on quantization will show that we can maintain a large amount of performance while significantly speeding up computation and saving on memory, storage, and costs. Let's dive into it!

@@ -381,5 +381,3 @@ This project is possible thanks to our collaboration with [mixedbread.ai](https:
 * https://txt.cohere.com/int8-binary-embeddings/
 * https://qdrant.tech/documentation/guides/quantization
 * https://zilliz.com/learn/scalar-quantization-and-product-quantization
-
-[^1]: For OpenAIs `text-embedding-3-large`, we see a performance retention of 93.1% at 12x compression. For nomic's model, we retain 95.8% of performance at 3x compression and 90% at 6x compression.
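
For readers skimming this diff, the two context paragraphs in the first hunk describe the techniques only at a high level. Below is a minimal sketch of what they mean in practice, assuming a recent `sentence-transformers` install with its `quantize_embeddings` helper available; the model choice, the truncation dimension `n = 256`, and the example sentences are illustrative assumptions, not part of this commit.

```python
# Illustrative sketch only (not part of this commit): MRL-style truncation to the
# first n dimensions, followed by quantization of the individual values.
import numpy as np
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

# Assumed model: nomic-embed-text-v1.5 (768-dim, MRL-trained); needs trust_remote_code.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)

sentences = [
    "Quantization shrinks each individual embedding value.",
    "MRL keeps only the first n embedding dimensions.",
]

# Full-size, L2-normalized float32 embeddings.
full = model.encode(sentences, normalize_embeddings=True)

# MRL: keep only the first n dimensions, then re-normalize so cosine similarity
# stays meaningful. n = 256 is 3x compression relative to 768 dimensions.
n = 256
truncated = full[:, :n]
truncated = truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

# Quantization: shrink each value instead of dropping dimensions,
# e.g. float32 -> int8 (4x smaller) or binary (32x smaller).
int8_emb = quantize_embeddings(truncated, precision="int8")
binary_emb = quantize_embeddings(truncated, precision="binary")

print(full.shape)        # (2, 768) float32
print(truncated.shape)   # (2, 256) after MRL truncation
print(int8_emb.shape)    # (2, 256) one int8 per dimension
print(binary_emb.shape)  # (2, 32)  bit-packed: 256 dims -> 32 bytes per embedding
```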
