|
| 1 | +[FAISS](https://github.com/facebookresearch/faiss) is a library for efficient similarity search and clustering of dense vectors. It is designed to work with large-scale datasets and provides a high-performance search engine for vector data. FAISS is optimized for memory usage and search speed, making it an excellent choice for production environments. |
| 2 | + |
| 3 | +### Usage |
| 4 | + |
| 5 | +```python |
| 6 | +import os |
| 7 | +from mem0 import Memory |
| 8 | + |
| 9 | +os.environ["OPENAI_API_KEY"] = "sk-xx" |
| 10 | + |
| 11 | +config = { |
| 12 | + "vector_store": { |
| 13 | + "provider": "faiss", |
| 14 | + "config": { |
| 15 | + "collection_name": "test", |
| 16 | + "path": "/tmp/faiss_memories", |
| 17 | + "distance_strategy": "euclidean" |
| 18 | + } |
| 19 | + } |
| 20 | +} |
| 21 | + |
| 22 | +m = Memory.from_config(config) |
| 23 | +messages = [ |
| 24 | + {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"}, |
| 25 | + {"role": "assistant", "content": "How about a thriller movies? They can be quite engaging."}, |
| 26 | + {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."}, |
| 27 | + {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."} |
| 28 | +] |
| 29 | +m.add(messages, user_id="alice", metadata={"category": "movies"}) |
| 30 | +``` |
| 31 | + |
| 32 | +### Installation |
| 33 | + |
| 34 | +To use FAISS in your mem0 project, you need to install the appropriate FAISS package for your environment: |
| 35 | + |
| 36 | +```bash |
| 37 | +# For CPU version |
| 38 | +pip install faiss-cpu |
| 39 | + |
| 40 | +# For GPU version (requires CUDA) |
| 41 | +pip install faiss-gpu |
| 42 | +``` |
| 43 | + |
| 44 | +### Config |
| 45 | + |
| 46 | +Here are the parameters available for configuring FAISS: |
| 47 | + |
| 48 | +| Parameter | Description | Default Value | |
| 49 | +| --- | --- | --- | |
| 50 | +| `collection_name` | The name of the collection | `mem0` | |
| 51 | +| `path` | Path to store FAISS index and metadata | `/tmp/faiss/<collection_name>` | |
| 52 | +| `distance_strategy` | Distance metric strategy to use (options: 'euclidean', 'inner_product', 'cosine') | `euclidean` | |
| 53 | +| `normalize_L2` | Whether to normalize L2 vectors (only applicable for euclidean distance) | `False` | |
| 54 | + |
| 55 | +### Performance Considerations |
| 56 | + |
| 57 | +FAISS offers several advantages for vector search: |
| 58 | + |
| 59 | +1. **Efficiency**: FAISS is optimized for memory usage and speed, making it suitable for large-scale applications. |
| 60 | +2. **Offline Support**: FAISS works entirely locally, with no need for external servers or API calls. |
| 61 | +3. **Storage Options**: Vectors can be stored in-memory for maximum speed or persisted to disk. |
| 62 | +4. **Multiple Index Types**: FAISS supports different index types optimized for various use cases (though mem0 currently uses the basic flat index). |
| 63 | + |
| 64 | +### Distance Strategies |
| 65 | + |
| 66 | +FAISS in mem0 supports three distance strategies: |
| 67 | + |
| 68 | +- **euclidean**: L2 distance, suitable for most embedding models |
| 69 | +- **inner_product**: Dot product similarity, useful for some specialized embeddings |
| 70 | +- **cosine**: Cosine similarity, best for comparing semantic similarity regardless of vector magnitude |
| 71 | + |
| 72 | +When using `cosine` or `inner_product` with normalized vectors, you may want to set `normalize_L2=True` for better results. |
0 commit comments