Commit 7f4526f

Add ranker to the fastembed (#287)
* Add ranker to the fastembed.md
* Update fastembed.md

---------

Co-authored-by: Stefano Fiorucci <[email protected]>
1 parent 3bcefbd commit 7f4526f

1 file changed, +45 -5 lines changed
integrations/fastembed.md

@@ -27,11 +27,12 @@ toc: true
 - [License](#license)
 
 ## Overview
-[FastEmbed](https://qdrant.github.io/fastembed/) is a lightweight, fast, Python library built for embedding generation.
+[FastEmbed](https://qdrant.github.io/fastembed/) is a lightweight, fast, Python library built for embedding generation and document ranking.
 
 - Light and fast: quantized model weights; ONNX Runtime for inference via Optimum.
 - Performant embedding models: list of [supported models](https://qdrant.github.io/fastembed/examples/Supported_Models/) - including multilingual models.
 - Support for sparse embedding models.
+- Good integration with Qdrant document store and retrievers.
 
 
 ## Installation
## Installation
@@ -43,10 +44,13 @@ pip install fastembed-haystack
 ## Usage
 ### Components
 The `fastembed-haystack` integrations provides the following components:
-- `FastembedTextEmbedder`: creates a dense embedding for text (used in query/RAG pipelines).
-- `FastembedDocumentEmbedder`: enriches documents with dense embeddings (used in indexing pipelines).
-- `FastembedSparseTextEmbedder`: creates a sparse embedding for text (used in query/RAG pipelines).
-- `FastembedSparseDocumentEmbedder`: enriches documents with sparse embeddings (used in indexing pipelines).
+- Embedders:
+  - `FastembedTextEmbedder`: creates a dense embedding for text (used in query/RAG pipelines).
+  - `FastembedDocumentEmbedder`: enriches documents with dense embeddings (used in indexing pipelines).
+  - `FastembedSparseTextEmbedder`: creates a sparse embedding for text (used in query/RAG pipelines).
+  - `FastembedSparseDocumentEmbedder`: enriches documents with sparse embeddings (used in indexing pipelines).
+- Ranker:
+  - `FastembedRanker`: ranks documents based on a query (used in query/RAG pipelines after the retrieval).
 
 ### Example with dense embeddings
 
@@ -125,6 +129,42 @@ result = query_pipeline.run({"sparse_text_embedder": {"text": query}})
 
 For a more detailed example, see this [notebook](https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/sparse_embedding_retrieval.ipynb).
 
+### Example with ranker
+
+```python
+from haystack import Document, Pipeline
+from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
+from haystack.document_stores.in_memory import InMemoryDocumentStore
+from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder, FastembedTextEmbedder
+from haystack_integrations.components.rankers.fastembed import FastembedRanker
+
+document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
+
+query = "Who supports fastembed?"
+
+documents = [
+    Document(content="My name is Wolfgang and I live in Berlin"),
+    Document(content="I saw a black horse running"),
+    Document(content="Germany has many big cities"),
+    Document(content="fastembed is supported by and maintained by Qdrant."),
+]
+
+document_embedder = FastembedDocumentEmbedder()
+document_embedder.warm_up()
+documents_with_embeddings = document_embedder.run(documents)["documents"]
+document_store.write_documents(documents_with_embeddings)
+
+query_pipeline = Pipeline()
+query_pipeline.add_component("text_embedder", FastembedTextEmbedder())
+query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
+query_pipeline.add_component("ranker", FastembedRanker(top_k=2))
+query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
+query_pipeline.connect("retriever.documents", "ranker.documents")
+
+
+result = query_pipeline.run({"text_embedder": {"text": query}, "ranker": { "query" : query }})
+```
+
 ### License
 
 `fastembed-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.
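
The ranker example added in this commit follows a retrieve-then-rerank pattern: an embedding retriever selects candidates by cosine similarity, then a ranker rescores those candidates against the query text and keeps the `top_k` best. The sketch below illustrates only that control flow, using the standard library. Everything in it is an assumption made for illustration: the `Doc` class, the `retrieve`, `rerank`, and `toy_relevance` helpers, and the two-dimensional embeddings are all hypothetical, and the word-overlap scorer is a stand-in rather than the model `FastembedRanker` actually uses.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document with an embedding and a score."""
    content: str
    embedding: List[float]
    score: float = 0.0


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: List[float], docs: List[Doc], top_k: int) -> List[Doc]:
    """Stage 1: score every document by embedding similarity, keep the best top_k."""
    scored = [Doc(d.content, d.embedding, cosine(query_embedding, d.embedding)) for d in docs]
    return sorted(scored, key=lambda d: d.score, reverse=True)[:top_k]


def toy_relevance(query: str, text: str) -> float:
    """Toy scorer (NOT a cross-encoder): fraction of query words present in text."""
    q = set(query.lower().replace("?", "").split())
    t = set(text.lower().replace(".", "").split())
    return len(q & t) / len(q) if q else 0.0


def rerank(query: str, docs: List[Doc], top_k: int) -> List[Doc]:
    """Stage 2: rescore the candidates against the query text, keep the best top_k."""
    rescored = [Doc(d.content, d.embedding, toy_relevance(query, d.content)) for d in docs]
    return sorted(rescored, key=lambda d: d.score, reverse=True)[:top_k]


# Made-up 2-D embeddings; a real embedder would produce these vectors.
docs = [
    Doc("My name is Wolfgang and I live in Berlin", [0.9, 0.1]),
    Doc("I saw a black horse running", [0.1, 0.9]),
    Doc("Germany has many big cities", [0.8, 0.3]),
    Doc("fastembed is supported by and maintained by Qdrant.", [0.7, 0.6]),
]
query = "Who supports fastembed?"
query_embedding = [0.8, 0.5]  # made-up query vector

candidates = retrieve(query_embedding, docs, top_k=3)
top = rerank(query, candidates, top_k=2)
for doc in top:
    print(f"{doc.score:.2f}  {doc.content}")
```

The second stage only sees what the first stage returns, so the retriever's `top_k` bounds what the ranker can surface. In the pipeline from the diff, the same hand-off is expressed by connecting `retriever.documents` to `ranker.documents`.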
