A lightweight, embedded vector database focused on simplicity and speed. TinyVecDB provides vector storage and similarity search without the complexity of client-server architectures.
- Embedded: No server setup or maintenance required
- Simple API: Get started with just a few lines of code
- Cross-platform: Works seamlessly on Windows, macOS, and Linux
- Lightweight: Minimal memory footprint and resource usage
- Offline-capable: Great for desktop and mobile applications
TinyVecDB is optimized for performance:
- 4-10x faster query times compared to alternatives
- Minimal memory footprint during operations
- Efficient metadata filtering with vector search
Check out our detailed benchmarks:
npm install tinyvecdb
pip install tinyvecdb
import {
TinyVecClient,
type TinyVecInsertion,
type TinyVecSearchOptions,
} from "tinyvecdb";
async function example() {
// Connect to database (will create the file if it doesn't exist)
const client = TinyVecClient.connect("./vectors.db", { dimensions: 512 });
// Create sample vectors
const insertions: TinyVecInsertion[] = [];
for (let i = 0; i < 50; i++) {
// Using Float32Array (more efficient)
const vec = new Float32Array(512);
for (let j = 0; j < 512; j++) {
vec[j] = Math.random();
}
// Or using standard JavaScript arrays
// const vec = Array.from({ length: 512 }, () => Math.random());
insertions.push({
vector: vec,
metadata: { name: `item-${i}`, category: "example" },
});
}
// Insert vectors
const inserted = await client.insert(insertions);
console.log("Inserted:", inserted);
// Create a query vector
const queryVector = new Float32Array(512);
for (let i = 0; i < 512; i++) {
queryVector[i] = Math.random();
}
// Search for similar vectors (without filtering)
const results = await client.search(queryVector, 5);
// Example results:
/*
[
{
id: 8,
similarity: 0.801587700843811,
metadata: { name: 'item-8', category: 'example' }
},
{
id: 16,
similarity: 0.7834401726722717,
metadata: { name: 'item-16', category: 'example' }
},
{
id: 5,
similarity: 0.7815409898757935,
metadata: { name: 'item-5', category: 'example' }
},
...
]
*/
// Search with filtering
const searchOptions: TinyVecSearchOptions = {
filter: { category: { $eq: "example" } },
};
const filteredResults = await client.search(queryVector, 5, searchOptions);
// Delete items by ID
const deleteByIdResult = await client.deleteByIds([1, 2, 3]);
console.log("Delete by ID result:", deleteByIdResult);
// Example output: Delete by ID result: { deletedCount: 3, success: true }
// Delete by metadata filter
const deleteByFilterResult = await client.deleteByFilter({
filter: { category: { $eq: "example" } },
});
console.log("Delete by filter result:", deleteByFilterResult);
// Example output: Delete by filter result: { deletedCount: 47, success: true }
// Get database statistics
const stats = await client.getIndexStats();
console.log(
`Database has ${stats.vectors} vectors with ${stats.dimensions} dimensions`
);
}
import asyncio
import numpy as np
import tinyvec
async def example():
# Connect to database (will create the file if it doesn't exist)
client = tinyvec.TinyVecClient()
config = tinyvec.ClientConfig(dimensions=512)
client.connect("./vectors.db", config)
# Create sample vectors
insertions = []
for i in range(50):
# Using NumPy (more efficient)
vec = np.random.rand(512).astype(np.float32)
vec = vec / np.linalg.norm(vec) # Normalize the vector
# Or using standard Python lists
# vec = [random.random() for _ in range(512)]
insertions.append(tinyvec.Insertion(
vector=vec,
metadata={"name": f"item-{i}", "category": "example"}
))
# Insert vectors
inserted = await client.insert(insertions)
print("Inserted:", inserted)
# Search for similar vectors (without filtering)
query_vec = np.random.rand(512).astype(np.float32)
results = await client.search(query_vec, 5)
# Example results:
# [SearchResult(similarity=0.801587700843811, id=8, metadata={'category': 'example', 'name': 'item-8'}),
# SearchResult(similarity=0.7834401726722717, id=16, metadata={'category': 'example', 'name': 'item-16'}),
# SearchResult(similarity=0.7815409898757935, id=5, metadata={'category': 'example', 'name': 'item-5'})]
# Search with filtering
search_options = tinyvec.SearchOptions(
filter={"category": {"$eq": "example"}}
)
filtered_results = await client.search(query_vec, 5, search_options)
# Delete items by ID
delete_result = await client.delete_by_ids([1, 2, 3])
print(f"Deleted {delete_result.deleted_count} vectors. Success: {delete_result.success}")
# Delete by metadata filter
filter_result = await client.delete_by_filter(search_options)
print(f"Deleted {filter_result.deleted_count} vectors by filter. Success: {filter_result.success}")
# Get database statistics
stats = await client.get_index_stats()
print(f"Database has {stats.vector_count} vectors with {stats.dimensions} dimensions")
if __name__ == "__main__":
asyncio.run(example())
- connect: Initialize the database with configuration
- insert: Add vectors with metadata
- search: Find similar vectors using cosine similarity
- deleteByIds: Remove vectors by their IDs
- deleteByFilter: Remove vectors that match specific filter criteria
- getIndexStats: Returns database statistics (vector count and dimensions)
- updateById: Update existing vectors with new metadata and/or vector data
- getPaginated: Retrieve paginated results from the database
TinyVecDB supports MongoDB-like query syntax for filtering vectors based on their metadata. These filters can be used with both the search
and deleteByFilter
methods.
$eq
: Matches values equal to a specified value$gt
: Matches values greater than a specified value$gte
: Matches values greater than or equal to a specified value$in
: Matches any values specified in an array$lt
: Matches values less than a specified value$lte
: Matches values less than or equal to a specified value$ne
: Matches values not equal to a specified value$nin
: Matches none of the values specified in an array
Filters can be nested for complex queries.
// Simple filter - find vectors with metadata where category equals "electronics"
const searchOptions: TinyVecSearchOptions = {
filter: { category: { $eq: "electronics" } },
};
// Array inclusion filter - find vectors with any of the specified tags
const inclusionFilter: TinyVecSearchOptions = {
filter: {
tags: { $in: ["premium", "featured", "new"] },
},
};
// Complex nested filter
const complexSearchOptions: TinyVecSearchOptions = {
filter: {
category: {
subcategory: {
value: { $gt: 200 },
},
},
tags: { $in: ["premium", "featured"] },
},
};
// Create a query vector
const queryVector = new Float32Array(512);
for (let i = 0; i < 512; i++) {
queryVector[i] = Math.random();
}
// Search with filter
const searchOptions: TinyVecSearchOptions = {
filter: { category: { $eq: "electronics" } },
};
const filteredResults = await client.search(queryVector, 10, searchOptions);
console.log("Found items in electronics category:", filteredResults.length);
// Delete all vectors with a specific category
const deleteOptions: TinyVecSearchOptions = {
filter: { category: { $eq: "outdated" } },
};
const deleteResult = await client.deleteByFilter(deleteOptions);
console.log(`Cleaned up ${deleteResult.deletedCount} outdated items`);
// Delete vectors with low ratings
const lowRatingOptions: TinyVecSearchOptions = {
filter: { rating: { $lt: 3.0 } },
};
const cleanupResult = await client.deleteByFilter(lowRatingOptions);
console.log(`Removed ${cleanupResult.deletedCount} low-rated items`);
- Python: Both NumPy arrays and Python lists are accepted, but all vectors are converted to NumPy float32 arrays before insertion for optimal performance.
- Node.js: Regular arrays and TypedArrays (Float32Array, etc.) are accepted, but all vectors are converted to Float32Array before insertion.
- Flexible metadata: You can include any serializable data in the metadata field, including deeply nested objects, arrays, primitive types, and complex data structures.
- Use cases: Ideal for storing document IDs, timestamps, text snippets, tags, category hierarchies, or any contextual information you need to retrieve with your vectors.
- Size considerations: While there's no strict size limit on metadata, keeping metadata compact will improve performance for large datasets.
- Automatic file creation: TinyVecDB automatically creates all necessary storage files when connecting if they don't exist.
- Dynamic dimension detection: If
dimensions
is not specified in the configuration, TinyVecDB will automatically use the length of the first inserted vector to update the database. - Flexible configuration: The config object is optional. When omitted, TinyVecDB will determine appropriate settings based on your first vector insertion while ensuring consistency for subsequent operations.
Option | Description | Default |
---|---|---|
path | Path to database file | required |
dimensions | Dimension of vectors | optional (derived from first vector) |
Good Use Cases | Less Ideal Use Cases |
---|---|
β Local semantic search in desktop apps | β Very complex multi-join query requireme |
β Rapid prototyping and MVPs | β Systems needing extensive admin tools/monitoring |
β Self-contained applications | β Cloud-native distributed architectures |
β Resource-constrained environments | β Applications requiring real-time replication |
- C Core Engine: Built with a native C core for maximum efficiency
- Minimal memory footprint
- Optimized similarity calculations with CPU-specific instructions
- Fast insert and query operations
Contributions are welcome! Please feel free to submit a Pull Request.
For more details on how to contribute to this project, please read our CONTRIBUTING.md guide.
This project is licensed under the MIT License - see the LICENSE file for details.