
This page documents some usage tips that may improve your experience with VectorCode.

CLI

Embedding & Reranking

The quality of the embeddings plays an important role in the quality of the query results. However, just as with LLMs, you have to compromise between inference cost and output quality. The default embedding model for this project is all-MiniLM-L6-v2, which offers a good balance between inference time and quality. If you're working on a capable machine, you may explore the sentence transformers documentation for more models that work out of the box with VectorCode. You can also use other embedding functions from chromadb to connect to cloud-hosted embedding providers, which will likely offer much more capable models, but at the cost of privacy.
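
As a rough sketch, assuming the embedding_function and embedding_params keys described in the VectorCode configuration documentation (verify the exact names against your version), switching to a larger sentence-transformers model in .vectorcode/config.json could look like this:

{
  "embedding_function": "SentenceTransformerEmbeddingFunction",
  "embedding_params": {
    "model_name": "sentence-transformers/all-mpnet-base-v2"
  }
}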

Rerankers compare query result candidates against your search queries and determine which chunks fit best. If you're experiencing poor-quality output, you may also find it helpful to try some of the rerankers (cross-encoders) from the sentence transformers documentation.
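
Similarly, a cross-encoder reranker can be selected from the config. The reranker and reranker_params keys and the CrossEncoderReranker name below reflect my reading of the VectorCode docs and should be treated as assumptions to verify against your version; the model name is just one example from the sentence-transformers cross-encoder collection:

{
  "reranker": "CrossEncoderReranker",
  "reranker_params": {
    "model_name_or_path": "cross-encoder/ms-marco-MiniLM-L-6-v2"
  }
}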

Collection Parameters

Collections are the abstraction layer for repositories in the vector database (and thus in VectorCode). Each repository has its own collection, and collections don't interfere with each other. When creating a new collection, you can set various options to configure it. If you run into errors on some query operations but not others, adjusting these parameters may help prevent the issue.
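
For example, Chroma collections accept HNSW index settings (such as hnsw:M and hnsw:search_ef) as collection metadata, and query errors that only affect some collections can often be traced back to these values. Assuming VectorCode forwards them through an hnsw key in .vectorcode/config.json (the key name is an assumption, so check the configuration docs for your version), raising them might look like this:

{
  "hnsw": {
    "hnsw:M": 64,
    "hnsw:search_ef": 128
  }
}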

Chunking Parameters

By default, VectorCode doesn't perform chunking when vectorising files. This works fine for small files but may cause loss of information on large files. To mitigate this, you may set the chunk_size option in .vectorcode/config.json to a value that works for your embedding model. This splits files into chunks of no more than chunk_size characters. Choose the value based on the embedding model: for the default model (sentence-transformers/all-MiniLM-L6-v2), input longer than 256 word pieces is truncated. A larger chunk size saves time during queries because there are fewer chunks to look at, but at the cost of potentially losing information.

For supported languages (see the pygments documentation and tree-sitter-language-pack), VectorCode semantically chunks documents with tree-sitter. For unsupported documents (the format couldn't be guessed, or isn't supported by tree-sitter-language-pack), VectorCode falls back to a sliding-window chunking algorithm on the text. By default, adjacent chunks overlap to avoid breaking information mid-sentence. This is controlled by the overlap_ratio option (configurable both in the JSON and as a command-line flag). A chunking run with chunk_size=1000 and overlap_ratio=0.2 produces chunks of 1000 characters each, with roughly 200 characters shared with the previous/next chunk. A larger overlap_ratio improves the coherence of chunks because sentences are less likely to be broken, but it also produces more chunks, increasing the time taken for queries.
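
Putting the two options together, a minimal chunking setup in .vectorcode/config.json (using the key names mentioned above) would be:

{
  "chunk_size": 1000,
  "overlap_ratio": 0.2
}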

Querying Parameters

The query results are organised into files, but the texts are stored as "chunks" (pieces of files). When querying the database, VectorCode retrieves many chunks and computes a score for each file present in the results based on the relevance between its chunks and the query messages. You can control this behaviour with the --multiplier command-line flag (or the query_multiplier option in the JSON config). When -n is set to $n$ and the multiplier is set to $m$, VectorCode queries for at most $m \times n$ chunks and works out the most relevant $n$ files from these chunks. This helps avoid chunks with low relevance scores, especially when many chunks from the same file have a wide range of scores (one chunk of a document is highly relevant, but the others are not). However, if you set the multiplier to a very small value, you risk not reranking enough chunks for a fair result.
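
For instance, with the multiplier set to 10, a query with -n 5 retrieves at most 50 chunks and reranks them down to the 5 most relevant files. In the JSON config (using the option name above) that is:

{
  "query_multiplier": 10
}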

Neovim Plugin

Backends

By default, the Neovim plugin calls VectorCode by spawning commands via vim.system or require("plenary.job"). This starts a new process for each query, and all of the CLI's dependencies have to be re-imported for every process, which is very slow. To improve query time, consider using the LSP backend, which keeps a process running alongside the editor. This avoids re-importing libraries and hence speeds up queries. It applies to async caching, the CodeCompanion.nvim tool and the CopilotChat context provider.
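
A minimal sketch of enabling it, assuming the plugin's setup function takes an async_backend option as described in the plugin README (double-check the option name against your installed version):

require("vectorcode").setup({
  -- keep one long-running LSP process instead of spawning a new CLI process per query
  async_backend = "lsp",
})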

Misc

Git Hooks

For VectorCode to work nicely, you need to make sure the embeddings in the database are up-to-date. Here's how I do it for the VectorCode repository:

  • Create .vectorcode/vectorcode.include and .vectorcode/vectorcode.exclude files to mark the files I want to include in and exclude from the embeddings. They follow the same syntax as .gitignore files. See more about how the two specs work here;
  • Run vectorcode vectorise from the CLI manually at least once so that files matched by .vectorcode/vectorcode.include can be vectorised. I only do this when setting up VectorCode for a new project;
  • Add the following pre-commit hook:
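    # re-vectorise the files staged for this commit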
    diff_files=$(git diff --cached --name-only)
    [ -z "$diff_files" ] || vectorcode vectorise $diff_files

This triggers vectorcode vectorise on files that you modified in the commit.

  • If you're working with different branches, add the following post-checkout hook:

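    # re-vectorise the files that differ between the previous and the new HEAD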
    files=$(git diff --name-only "$1" "$2")
    [ -z "$files" ] || vectorcode vectorise $files

    This triggers vectorcode vectorise on files that changed as a result of the checkout (which also covers merge, rebase, etc.).

These git hooks help keep the embeddings in the database up to date so that they reflect the latest state of your codebase.

Commands Hang Forever

The default embedding engine uses the sentence transformers library, which uses transformers behind the scenes. This can cause commands to hang forever if you don't have steady access to Hugging Face. To fix this, make sure you pass the HTTP_PROXY and/or HTTPS_PROXY environment variables when you run the command from the CLI or start the LSP/MCP server.
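
From a shell, that can be as simple as prefixing the command with the variables (the proxy address below is a placeholder):

HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890 vectorcode vectorise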

For example, when using the LSP mode in neovim, you can use the following snippet to make sure that your HTTP_PROXY and HTTPS_PROXY variables are passed to the subprocess that runs the LSP:

vim.lsp.config("vectorcode_server", {
  cmd_env = {
    HTTP_PROXY = os.getenv("HTTP_PROXY"),
    HTTPS_PROXY = os.getenv("HTTPS_PROXY"),
  },
})