Tips and Tricks
This page documents some usage tips that may improve your experience with VectorCode.
The quality of the embeddings plays an important role in the quality of the query results. However, just like with LLMs, you have to balance inference cost against output quality. The default embedding model for this project is all-MiniLM-L6-v2, which offers a good balance between inference time and quality. If you're working on a capable machine, you may explore the sentence-transformers documentation for more models that work out of the box with VectorCode. You can also use other embedding functions from chromadb to connect to cloud-hosted embedding providers, which will very likely offer much more capable models, but at the cost of privacy.
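As a sketch, the embedding backend can be switched in `.vectorcode/config.json`. The `embedding_function` and `embedding_params` keys below (forwarded to the chromadb embedding function) reflect my reading of the configuration schema, and the model name is just an example of a larger sentence-transformers model; double-check both against the configuration docs for your version:

```json
{
  "embedding_function": "SentenceTransformerEmbeddingFunction",
  "embedding_params": {
    "model_name": "sentence-transformers/all-mpnet-base-v2"
  }
}
```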
Rerankers compare query result candidates against your search queries and determine which chunks are the best matches. If you're experiencing poor-quality output, you may also find it helpful to try some of the rerankers (cross-encoders) from their documentation.
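Similarly, if your version exposes reranker settings in `.vectorcode/config.json`, the configuration might look like the sketch below. The `reranker` and `reranker_params` key names are assumptions for illustration and are not verified against a specific release; the model is a commonly used sentence-transformers cross-encoder:

```json
{
  "reranker": "CrossEncoderReranker",
  "reranker_params": {
    "model_name_or_path": "cross-encoder/ms-marco-MiniLM-L-6-v2"
  }
}
```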
Collections are the abstraction layer for repositories in the vector database (and thus, VectorCode). Each repository has its own collection, and they don't interfere with each other. When creating a new collection, you can set various options to configure it. If you run into errors on some query operations but not all of them, altering these parameters may help prevent the issue.
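For instance, chromadb collections accept HNSW index parameters (such as `hnsw:M` and `hnsw:construction_ef`) as collection metadata, and these are the kind of options worth tweaking when some queries fail. Whether VectorCode exposes them under an `hnsw` key in `.vectorcode/config.json` is an assumption here; treat the snippet as a sketch and check the configuration docs:

```json
{
  "hnsw": {
    "hnsw:M": 64,
    "hnsw:construction_ef": 128
  }
}
```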
By default, VectorCode doesn't perform chunking when vectorising files. This works fine on small files but may cause loss of information on large files. To mitigate this, you may set the `chunk_size` option in `.vectorcode/config.json` to a value that works for your embedding model. This will split the files into chunks with no more than `chunk_size` characters. The value should be chosen based on the embedding model. For example, for the default embedding model (`sentence-transformers/all-MiniLM-L6-v2`), the input will be truncated to 256 words. A larger chunk size will save time during queries because there will be fewer chunks to look at, but at the cost of potentially losing information.
For supported languages (see the pygments documentation and tree-sitter-language-pack), VectorCode will semantically chunk the documents with tree-sitter. For unsupported documents (failed to guess the format, or format unsupported by tree-sitter-language-pack), VectorCode falls back to a sliding-window chunking algorithm on the text. By default, adjacent chunks will overlap to avoid breaking information mid-sentence. This is controlled by the `overlap_ratio` option (configurable both in the JSON and as a command line flag). A chunking process with `chunk_size=1000` and `overlap_ratio=0.2` will produce a sequence of chunks with 1000 characters each, with ~200 characters shared with the previous/next chunks. A larger `overlap_ratio` will improve the coherence of chunks because it's less likely that sentences will be broken, but it'll also lead to more chunks, increasing the time taken for queries.
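Putting the two options together, the example above roughly corresponds to a `.vectorcode/config.json` like the following (the values are illustrative; pick a `chunk_size` that fits your embedding model's input limit):

```json
{
  "chunk_size": 1000,
  "overlap_ratio": 0.2
}
```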
The query results are organised into files, but the texts are stored as "chunks" (pieces of files). When querying the database, VectorCode queries many chunks and computes a score for each of the files present in the query results, based on the relevance between the chunks and the query messages. You can control this behaviour with the `--multiplier` command line flag (or the `query_multiplier` option in the JSON config). When `-n` is set, the multiplier determines how many chunks are retrieved relative to the number of files requested: a larger multiplier considers more chunks, which may improve the file-level ranking, but makes the query slower.
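For example, to make each query consider more chunks, you could set the multiplier in `.vectorcode/config.json` (the value here is only illustrative):

```json
{
  "query_multiplier": 10
}
```

The same setting can also be passed on the command line via `--multiplier` for a single query.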
By default, the neovim plugin calls VectorCode by running commands via `vim.system` or `require("plenary.job")`. This initialises a new process for each query, and all dependencies in the CLI have to be re-imported for every process. This is VERY SLOW. To improve the query time, consider using the LSP backend, which keeps a process running alongside the editor. This avoids the re-importing of libraries, and hence accelerates the queries. This applies to async caching, the CodeCompanion.nvim tool and the CopilotChat context provider.
For VectorCode to work nicely, you need to make sure the embeddings in the database are up-to-date. Here's how I do it for the VectorCode repository:
- Create `.vectorcode/vectorcode.include` and `.vectorcode/vectorcode.exclude` files to mark files that I want to include and exclude in the embeddings. They follow the same syntax as the `.gitignore` files. See more about how the 2 specs work here;
- Run `vectorcode vectorise` from the CLI manually at least once so that files matched by `.vectorcode/vectorcode.include` can be vectorised. I only do this when setting up VectorCode for a new project;
- Add the following pre-commit hook:

  ```bash
  diff_files=$(git diff --cached --name-only)
  [ -z "$diff_files" ] || vectorcode vectorise $diff_files
  ```

  This triggers `vectorcode vectorise` on the files that you modified in the commit.
- If you're working with different branches, add the following post-checkout hook:

  ```bash
  files=$(git diff --name-only "$1" "$2")
  [ -z "$files" ] || vectorcode vectorise $files
  ```

  This triggers `vectorcode vectorise` on files that are changed by a `checkout` (which also includes `merge`, `rebase`, etc.).
The above git hooks help keep the embeddings in the database up-to-date so that they reflect the latest state of your codebase.
The default embedding engine uses the sentence-transformers library, which uses transformers behind the scenes. This could cause the command to hang forever if you don't have steady access to Hugging Face. To fix this, make sure you pass the `HTTP_PROXY` and/or `HTTPS_PROXY` environment variables when you run the command from the CLI or start the LSP/MCP server.

For example, when using the LSP mode in neovim, you can use the following snippet to make sure that your `HTTP_PROXY` and `HTTPS_PROXY` variables are passed to the subprocess that runs the LSP:
vim.lsp.config("vectorcode_server", {
cmd_env = {
HTTP_PROXY = os.getenv("HTTP_PROXY"),
HTTPS_PROXY = os.getenv("HTTPS_PROXY"),
},
})