Description
We could add an option to save embeddings with binary quantization, which gives a 32x reduction in storage space relative to float32, often with only a small performance penalty: https://emschwartz.me/binary-vector-embeddings-are-so-cool/
Similar to the truncate function, embeddings could have a button that makes a quantized version.
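A rough sketch of what the quantization step could look like, assuming embeddings are stored as float32 NumPy arrays and that thresholding at zero is acceptable. The function name `binary_quantize` is hypothetical and not part of the existing code:

```python
import numpy as np

def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 dimensions per byte.

    Hypothetical helper: thresholding at 0 is an assumption, not a decision
    made in this issue.
    """
    bits = embeddings > 0  # True where the component is positive
    return np.packbits(bits, axis=-1)  # uint8, shape (n, ceil(dim / 8))

# Small demonstration with random data (4 vectors, 64 dimensions).
rng = np.random.default_rng(0)
floats = rng.standard_normal((4, 64)).astype(np.float32)
packed = binary_quantize(floats)

# 32 bits per dimension become 1 bit per dimension.
ratio = floats.nbytes // packed.nbytes
```

Here `floats` takes 1024 bytes and `packed` takes 32 bytes, which is where the 32x figure comes from.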
We would need to record the quantization in the metadata so that UMAP and the nearest neighbor search code can use Hamming distance instead of cosine similarity for quantized embeddings.
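One possible shape for the nearest neighbor side, as a sketch only. It assumes the quantized embeddings are bit-packed uint8 arrays and that the metadata is a dict with a `"quantization"` key; both the key name and the `nearest` function are made up for illustration:

```python
import numpy as np

def hamming_distances(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed query and each packed corpus row."""
    xor = np.bitwise_xor(corpus, query)  # broadcasts (n, bytes) against (bytes,)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

def nearest(query: np.ndarray, corpus: np.ndarray, metadata: dict, k: int = 3) -> np.ndarray:
    """Return indices of the k closest rows, choosing the metric from metadata."""
    # "quantization" is a hypothetical metadata key, not an existing field.
    if metadata.get("quantization") == "binary":
        dist = hamming_distances(query, corpus)
    else:
        # Fall back to cosine distance for unquantized float embeddings.
        q = query / np.linalg.norm(query)
        c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
        dist = 1.0 - c @ q
    return np.argsort(dist, kind="stable")[:k]

# Three 8-dimensional binary vectors, packed into one byte each.
raw_bits = np.array(
    [[1, 1, 1, 1, 0, 0, 0, 0],
     [1, 1, 1, 1, 1, 1, 1, 1],
     [0, 0, 0, 0, 0, 0, 0, 0]],
    dtype=np.uint8,
)
corpus = np.packbits(raw_bits, axis=-1)
query = corpus[0]

distances = hamming_distances(query, corpus)
top = nearest(query, corpus, {"quantization": "binary"}, k=2)
```

The query matches the first row exactly and differs from the other two rows in four bits each. UMAP would need the same metadata check before picking its metric.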
This could also help with token-level embeddings (#64), since those could significantly increase storage costs.