docs[patch]: Update retrieval and embeddings docs #5429

Merged · 3 commits · May 17, 2024
6 changes: 3 additions & 3 deletions docs/core_docs/.gitignore
@@ -95,6 +95,8 @@ docs/how_to/output_parser_structured.md
docs/how_to/output_parser_structured.mdx
docs/how_to/output_parser_json.md
docs/how_to/output_parser_json.mdx
+ docs/how_to/multiple_queries.md
+ docs/how_to/multiple_queries.mdx
docs/how_to/logprobs.md
docs/how_to/logprobs.mdx
docs/how_to/graph_semantic.md
@@ -138,6 +140,4 @@ docs/how_to/binding.mdx
docs/how_to/assign.md
docs/how_to/assign.mdx
docs/how_to/agent_executor.md
- docs/how_to/agent_executor.mdx
- docs/how_to/MultiQueryRetriever.md
- docs/how_to/MultiQueryRetriever.mdx
+ docs/how_to/agent_executor.mdx
6 changes: 3 additions & 3 deletions docs/core_docs/docs/concepts.mdx
@@ -449,10 +449,10 @@ const retriever = vectorstore.asRetriever();

### Retrievers

- A retriever is an interface that returns documents given an unstructured query.
- It is more general than a vector store.
+ A retriever is an interface that returns relevant documents given an unstructured query.
+ Retrievers are more general than vector stores.
A retriever does not need to be able to store documents, only to return (or retrieve) them.
- Retrievers can be created from vectorstores, but are also broad enough to include [Exa search](/docs/integrations/retrievers/exa/)(web search) and [Amazon Kendra](/docs/integrations/retrievers/kendra-retriever/).
+ Retrievers can be created from vector stores, but are also broad enough to include [Exa search](/docs/integrations/retrievers/exa/) (web search) and [Amazon Kendra](/docs/integrations/retrievers/kendra-retriever/).

Retrievers accept a string query as input and return an array of `Document`s as output.
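
As a minimal sketch of that contract, using `MemoryVectorStore` and OpenAI embeddings purely as stand-ins for any retriever source:

```typescript
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

// Build a small vector store; any retriever exposes the same interface.
const vectorstore = await MemoryVectorStore.fromTexts(
  ["LangChain helps build LLM apps", "Retrievers return documents"],
  [{ id: 1 }, { id: 2 }],
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

// String query in, array of Documents out.
const docs = await retriever.invoke("What do retrievers return?");
console.log(docs[0].pageContent);
```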

65 changes: 19 additions & 46 deletions docs/core_docs/docs/how_to/caching_embeddings.mdx
@@ -1,10 +1,17 @@
import CodeBlock from "@theme/CodeBlock";
import InMemoryExample from "@examples/embeddings/cache_backed_in_memory.ts";
- import ConvexExample from "@examples/embeddings/convex/cache_backed_convex.ts";
import RedisExample from "@examples/embeddings/cache_backed_redis.ts";

# How to cache embedding results

+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Embeddings](/docs/concepts/#embedding-models)
+
+ :::

Embeddings can be stored or temporarily cached to avoid needing to recompute them.

Caching embeddings can be done using a `CacheBackedEmbeddings` instance.
@@ -15,13 +22,13 @@ The text is hashed and the hash is used as the key in the cache.

The main supported way to initialize a `CacheBackedEmbeddings` is the `fromBytesStore` static method. This takes in the following parameters:

- - `underlying_embedder`: The embedder to use for embedding.
- - `document_embedding_cache`: The cache to use for storing document embeddings.
- - `namespace`: (optional, defaults to "") The namespace to use for document cache. This namespace is used to avoid collisions with other caches. For example, set it to the name of the embedding model used.
+ - `underlyingEmbeddings`: The embeddings model to use.
+ - `documentEmbeddingCache`: The cache to use for storing document embeddings.
+ - `namespace`: (optional, defaults to "") The namespace to use for document cache. This namespace is used to avoid collisions with other caches. For example, you could set it to the name of the embedding model used.

**Attention:** Be sure to set the namespace parameter to avoid collisions of the same text embedded using different embeddings models.
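
A rough sketch of how these parameters fit together, assuming OpenAI embeddings and the built-in in-memory store (any `BaseStore` of bytes works):

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { InMemoryStore } from "langchain/storage/in_memory";

const underlyingEmbeddings = new OpenAIEmbeddings();

const cacheBackedEmbeddings = CacheBackedEmbeddings.fromBytesStore(
  underlyingEmbeddings,
  new InMemoryStore(),
  // Namespacing by model name avoids collisions between different models.
  { namespace: underlyingEmbeddings.modelName }
);

await cacheBackedEmbeddings.embedDocuments(["hello"]); // computes and caches
await cacheBackedEmbeddings.embedDocuments(["hello"]); // served from the cache
```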

- ## Usage, in-memory
+ ## In-memory

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

@@ -36,47 +43,7 @@ Do not use this cache if you need to actually store the embeddings for an extend

<CodeBlock language="typescript">{InMemoryExample}</CodeBlock>

- ## Usage, Convex
-
- Here's an example with [Convex](https://convex.dev/) as a cache.
-
- ### Create project
-
- Get a working [Convex](https://docs.convex.dev/) project set up, for example by using:
-
- ```bash
- npm create convex@latest
- ```
-
- ### Add database accessors
-
- Add query and mutation helpers to `convex/langchain/db.ts`:
-
- ```ts title="convex/langchain/db.ts"
- export * from "langchain/util/convex";
- ```
-
- ### Configure your schema
-
- Set up your schema (for indexing):
-
- ```ts title="convex/schema.ts"
- import { defineSchema, defineTable } from "convex/server";
- import { v } from "convex/values";
-
- export default defineSchema({
-   cache: defineTable({
-     key: v.string(),
-     value: v.any(),
-   }).index("byKey", ["key"]),
- });
- ```
-
- ### Example
-
- <CodeBlock language="typescript">{ConvexExample}</CodeBlock>
-
- ## Usage, Redis
+ ## Redis

Here's an example with a Redis cache.

@@ -87,3 +54,9 @@ npm install ioredis
```

<CodeBlock language="typescript">{RedisExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned how to use caching to avoid recomputing embeddings.
+
+ Next, check out the [full tutorial on retrieval-augmented generation](/docs/tutorials/rag).
20 changes: 16 additions & 4 deletions docs/core_docs/docs/how_to/contextual_compression.mdx
@@ -1,9 +1,14 @@
- ---
- hide_table_of_contents: true
- ---

# How to do retrieval with contextual compression

+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+ - [Retrieval-augmented generation (RAG)](/docs/tutorials/rag)
+
+ :::

One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.

Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.
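
A sketch of the pattern, assuming `LLMChainExtractor` as the compressor and an in-memory vector store as the base retriever; other compressors and base retrievers slot in the same way:

```typescript
import { OpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ContextualCompressionRetriever } from "langchain/retrievers/contextual_compression";
import { LLMChainExtractor } from "langchain/retrievers/document_compressors/chain_extract";

const vectorstore = await MemoryVectorStore.fromTexts(
  ["A long document where only one sentence answers the question..."],
  [{}],
  new OpenAIEmbeddings()
);

// The compressor uses an LLM to keep only the query-relevant parts of each doc.
const retriever = new ContextualCompressionRetriever({
  baseCompressor: LLMChainExtractor.fromLLM(new OpenAI({ temperature: 0 })),
  baseRetriever: vectorstore.asRetriever(),
});

const compressedDocs = await retriever.invoke("What does the document answer?");
```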
@@ -58,3 +63,10 @@ This skips the need to add documents to a vector store to perform similarity sea
import DocumentCompressorPipelineExample from "@examples/retrievers/document_compressor_pipeline.ts";

<CodeBlock language="typescript">{DocumentCompressorPipelineExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned a few ways to use contextual compression to remove bad data from your results.
+
+ See the individual sections for deeper dives on specific retrievers, the [broader tutorial on RAG](/docs/tutorials/rag), or this section to learn how to
+ [create your own custom retriever over any data source](/docs/modules/data_connection/retrievers/custom).
27 changes: 18 additions & 9 deletions docs/core_docs/docs/how_to/custom_retriever.mdx
@@ -1,14 +1,17 @@
- ---
- hide_table_of_contents: true
- sidebar_position: 0
- ---

# How to write a custom retriever class

- To create your own retriever, you need to extend the [`BaseRetriever` class](https://api.js.langchain.com/classes/langchain_core_retrievers.BaseRetriever.html)
- and implement a `_getRelevantDocuments` method that takes a `string` as its first parameter and an optional `runManager` for tracing.
- This method should return an array of `Document`s fetched from some source. This process can involve calls to a database or to the web using `fetch`.
- Note the underscore before `_getRelevantDocuments()` - the base class wraps the non-prefixed version in order to automatically handle tracing of the original call.
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+
+ :::

+ To create your own retriever, you need to extend the [`BaseRetriever`](https://api.js.langchain.com/classes/langchain_core_retrievers.BaseRetriever.html) class
+ and implement a `_getRelevantDocuments` method that takes a `string` as its first parameter (and an optional `runManager` for tracing).
+ This method should return an array of `Document`s fetched from some source. This process can involve calls to a database, to the web using `fetch`, or any other source.
+ Note the underscore before `_getRelevantDocuments()`. The base class wraps the non-prefixed version in order to automatically handle tracing of the original call.

Here's an example of a custom retriever that returns static documents:
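
The example file itself is collapsed in this view; a minimal sketch along the lines described above, assuming the `@langchain/core` retriever APIs, could look like:

```typescript
import {
  BaseRetriever,
  type BaseRetrieverInput,
} from "@langchain/core/retrievers";
import type { CallbackManagerForRetrieverRun } from "@langchain/core/callbacks/manager";
import { Document } from "@langchain/core/documents";

export class StaticRetriever extends BaseRetriever {
  lc_namespace = ["langchain", "retrievers"];

  constructor(fields?: BaseRetrieverInput) {
    super(fields);
  }

  // Note the underscore: the base class wraps this method to add tracing.
  async _getRelevantDocuments(
    query: string,
    _runManager?: CallbackManagerForRetrieverRun
  ): Promise<Document[]> {
    // A real implementation would query a database or `fetch` a web API here.
    return [
      new Document({ pageContent: `Some document pertaining to ${query}` }),
    ];
  }
}

const retriever = new StaticRetriever();
await retriever.invoke("LangChain docs");
```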

@@ -70,3 +73,9 @@ await retriever.invoke("LangChain docs");
}
]
```

+ ## Next steps
+
+ You've now seen an example of implementing your own custom retriever.
+
+ Next, check out the individual sections for deeper dives on specific retrievers, or the [broader tutorial on RAG](/docs/tutorials/rag).
16 changes: 13 additions & 3 deletions docs/core_docs/docs/how_to/embed_text.mdx
@@ -8,16 +8,20 @@ sidebar_position: 2
Head to [Integrations](/docs/integrations/text_embedding) for documentation on built-in integrations with text embedding providers.
:::

- The Embeddings class is a class designed for interfacing with text embedding models. There are lots of embedding model providers (OpenAI, Cohere, Hugging Face, etc) - this class is designed to provide a standard interface for all of them.
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Embeddings](/docs/concepts/#embedding-models)
+
+ :::

Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

The base Embeddings class in LangChain exposes two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).

- ## Get started

- Embeddings can be used to create a numerical representation of textual data. This numerical representation is useful because it can be used to find similar documents.

Below is an example of how to use the OpenAI embeddings. Embeddings occasionally have different embedding methods for queries versus documents, so the embedding class exposes both an `embedQuery` and an `embedDocuments` method.
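
The full example is collapsed here, but a minimal sketch of the two methods, assuming `@langchain/openai`, looks roughly like:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();

// Embed a single search query...
const queryVector = await embeddings.embedQuery("What is LangChain?");

// ...and embed a batch of documents to be indexed.
const documentVectors = await embeddings.embedDocuments([
  "Hello world",
  "Bye bye",
]);

console.log(queryVector.length, documentVectors.length); // e.g. 1536 2
```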

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
@@ -77,3 +81,9 @@ const documentRes = await embeddings.embedDocuments(["Hello world", "Bye bye"]);
]
*/
```

+ ## Next steps
+
+ You've now learned how to use embeddings models with queries and text.
+
+ Next, check out how to [avoid excessively recomputing embeddings with caching](/docs/how_to/caching_embeddings), or the [full tutorial on retrieval-augmented generation](/docs/tutorials/rag).
6 changes: 3 additions & 3 deletions docs/core_docs/docs/how_to/index.mdx
@@ -127,14 +127,14 @@ Embedding Models take a piece of text and create a numerical representation of it.

Vector stores are databases that can efficiently store and retrieve embeddings.

- - [How to: use a vector store to retrieve data](/docs/how_to/vectorstores)
+ - [How to: create and query vector stores](/docs/how_to/vectorstores)

### Retrievers

Retrievers are responsible for taking a query and returning relevant documents.

- [How to: use a vector store to retrieve data](/docs/how_to/vectorstore_retriever)
- - [How to: generate multiple queries to retrieve data for](/docs/how_to/MultiQueryRetriever)
+ - [How to: generate multiple queries to retrieve data for](/docs/how_to/multiple_queries)
- [How to: use contextual compression to compress the data retrieved](/docs/how_to/contextual_compression)
- [How to: write a custom retriever class](/docs/how_to/custom_retriever)
- [How to: add similarity scores to retriever results](/docs/how_to/add_scores_retriever)
@@ -144,7 +144,7 @@ Retrievers are responsible for taking a query and returning relevant documents.
- [How to: retrieve the whole document for a chunk](/docs/how_to/parent_document_retriever)
- [How to: generate metadata filters](/docs/how_to/self_query)
- [How to: create a time-weighted retriever](/docs/how_to/time_weighted_vectorstore)
- - [How to: use a Matryoshka retriever](/docs/how_to/matryoshka_retriever)
+ - [How to: reduce retrieval latency](/docs/how_to/reduce_retrieval_latency)

### Indexing

29 changes: 21 additions & 8 deletions docs/core_docs/docs/how_to/multi_vector.mdx
@@ -1,18 +1,24 @@
- ---
- hide_table_of_contents: true
- ---

# How to generate multiple embeddings per document

- It can often be beneficial to store multiple vectors per document.
- LangChain has a base MultiVectorRetriever which makes querying this type of setup easier!
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+ - [Text splitters](/docs/concepts/#text-splitters)
+ - [Retrieval-augmented generation (RAG)](/docs/tutorials/rag)
+
+ :::

+ Embedding different representations of an original document, then returning the original document when any of the representations results in a search hit, can allow you to
+ tune and improve your retrieval performance. LangChain has a base [`MultiVectorRetriever`](https://api.js.langchain.com/classes/langchain_retrievers_multi_vector.MultiVectorRetriever.html) designed to do just this!

A lot of the complexity lies in how to create the multiple vectors per document.
- This notebook covers some of the common ways to create those vectors and use the MultiVectorRetriever.
+ This guide covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.

Some methods to create multiple vectors per document include:

- - smaller chunks: split a document into smaller chunks, and embed those (e.g. the [ParentDocumentRetriever](/docs/modules/data_connection/retrievers/parent-document-retriever))
+ - smaller chunks: split a document into smaller chunks, and embed those (e.g. the [`ParentDocumentRetriever`](/docs/modules/data_connection/retrievers/parent-document-retriever))
- summary: create a summary for each document, embed that along with (or instead of) the document
- hypothetical questions: create hypothetical questions that each document would be appropriate to answer, embed those along with (or instead of) the document
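
A rough sketch of the pattern all three methods share (index the small representations, store the parents, return the parents), assuming the `docstore`/`idKey` options, an in-memory store, and the `uuid` package:

```typescript
import { MultiVectorRetriever } from "langchain/retrievers/multi_vector";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { InMemoryStore } from "langchain/storage/in_memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";
import { v4 as uuidv4 } from "uuid";

const idKey = "doc_id";
const parentDocs = [
  new Document({ pageContent: "A long original document..." }),
];
const docIds = parentDocs.map(() => uuidv4());

// Small representations (here, stand-in "summaries") carry the parent id.
const summaries = parentDocs.map(
  (doc, i) =>
    new Document({
      pageContent: doc.pageContent.slice(0, 50), // replace with a real LLM summary
      metadata: { [idKey]: docIds[i] },
    })
);

// The vector store indexes only the small representations...
const vectorstore = await MemoryVectorStore.fromDocuments(
  summaries,
  new OpenAIEmbeddings()
);
// ...while the doc store holds the full parents, keyed by id.
const docstore = new InMemoryStore<Document>();
await docstore.mset(docIds.map((id, i) => [id, parentDocs[i]]));

const retriever = new MultiVectorRetriever({ vectorstore, docstore, idKey });

// The query matches a summary, but the full parent document is returned.
const results = await retriever.invoke("long original");
```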

@@ -54,3 +60,10 @@ These questions can then be embedded and used to retrieve the original document:
import HypotheticalExample from "@examples/retrievers/multi_vector_hypothetical.ts";

<CodeBlock language="typescript">{HypotheticalExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned a few ways to generate multiple embeddings per document.
+
+ Next, check out the individual sections for deeper dives on specific retrievers, the [broader tutorial on RAG](/docs/tutorials/rag), or this section to learn how to
+ [create your own custom retriever over any data source](/docs/modules/data_connection/retrievers/custom).