docs[patch]: Update retrieval and embeddings docs #5429

Merged · 3 commits · May 17, 2024
6 changes: 3 additions & 3 deletions docs/core_docs/.gitignore
@@ -95,6 +95,8 @@ docs/how_to/output_parser_structured.md
docs/how_to/output_parser_structured.mdx
docs/how_to/output_parser_json.md
docs/how_to/output_parser_json.mdx
+ docs/how_to/multiple_queries.md
+ docs/how_to/multiple_queries.mdx
docs/how_to/logprobs.md
docs/how_to/logprobs.mdx
docs/how_to/graph_semantic.md
@@ -138,6 +140,4 @@ docs/how_to/binding.mdx
docs/how_to/assign.md
docs/how_to/assign.mdx
docs/how_to/agent_executor.md
- docs/how_to/agent_executor.mdx
- docs/how_to/MultiQueryRetriever.md
- docs/how_to/MultiQueryRetriever.mdx
+ docs/how_to/agent_executor.mdx
6 changes: 3 additions & 3 deletions docs/core_docs/docs/concepts.mdx
@@ -449,10 +449,10 @@ const retriever = vectorstore.asRetriever();

### Retrievers

- A retriever is an interface that returns documents given an unstructured query.
- It is more general than a vector store.
+ A retriever is an interface that returns relevant documents given an unstructured query.
+ Retrievers are more general than vector stores.
A retriever does not need to be able to store documents, only to return (or retrieve) them.
- Retrievers can be created from vectorstores, but are also broad enough to include [Exa search](/docs/integrations/retrievers/exa/)(web search) and [Amazon Kendra](/docs/integrations/retrievers/kendra-retriever/).
+ Retrievers can be created from vector stores, but are also broad enough to include [Exa search](/docs/integrations/retrievers/exa/) (web search) and [Amazon Kendra](/docs/integrations/retrievers/kendra-retriever/).

Retrievers accept a string query as input and return an array of `Document`s as output.
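
As a minimal sketch of that contract, using `MemoryVectorStore` and OpenAI embeddings purely as stand-ins for any retriever source:

```typescript
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

// Build a small vector store; any retriever exposes the same interface.
const vectorstore = await MemoryVectorStore.fromTexts(
  ["LangChain helps build LLM apps", "Retrievers return documents"],
  [{ id: 1 }, { id: 2 }],
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

// String query in, array of Documents out.
const docs = await retriever.invoke("What do retrievers return?");
console.log(docs[0].pageContent);
```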

65 changes: 19 additions & 46 deletions docs/core_docs/docs/how_to/caching_embeddings.mdx
@@ -1,10 +1,17 @@
import CodeBlock from "@theme/CodeBlock";
import InMemoryExample from "@examples/embeddings/cache_backed_in_memory.ts";
- import ConvexExample from "@examples/embeddings/convex/cache_backed_convex.ts";
import RedisExample from "@examples/embeddings/cache_backed_redis.ts";

# How to cache embedding results

+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Embeddings](/docs/concepts/#embedding-models)
+
+ :::

Embeddings can be stored or temporarily cached to avoid needing to recompute them.

Caching embeddings can be done using a `CacheBackedEmbeddings` instance.
@@ -15,13 +22,13 @@ The text is hashed and the hash is used as the key in the cache.

The main supported way to initialize a `CacheBackedEmbeddings` is the `fromBytesStore` static method. This takes in the following parameters:

- - `underlying_embedder`: The embedder to use for embedding.
- - `document_embedding_cache`: The cache to use for storing document embeddings.
- - `namespace`: (optional, defaults to "") The namespace to use for document cache. This namespace is used to avoid collisions with other caches. For example, set it to the name of the embedding model used.
+ - `underlyingEmbeddings`: The embeddings model to use.
+ - `documentEmbeddingCache`: The cache to use for storing document embeddings.
+ - `namespace`: (optional, defaults to "") The namespace to use for document cache. This namespace is used to avoid collisions with other caches. For example, you could set it to the name of the embedding model used.

**Attention:** Be sure to set the namespace parameter to avoid collisions of the same text embedded using different embeddings models.
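
A rough sketch of how these parameters fit together, assuming OpenAI embeddings and the built-in in-memory store (any `BaseStore` of bytes works):

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { InMemoryStore } from "langchain/storage/in_memory";

const underlyingEmbeddings = new OpenAIEmbeddings();

const cacheBackedEmbeddings = CacheBackedEmbeddings.fromBytesStore(
  underlyingEmbeddings,
  new InMemoryStore(),
  // Namespacing by model name avoids collisions between different models.
  { namespace: underlyingEmbeddings.modelName }
);

await cacheBackedEmbeddings.embedDocuments(["hello"]); // computes and caches
await cacheBackedEmbeddings.embedDocuments(["hello"]); // served from the cache
```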

- ## Usage, in-memory
+ ## In-memory

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

@@ -36,47 +43,7 @@ Do not use this cache if you need to actually store the embeddings for an extend

<CodeBlock language="typescript">{InMemoryExample}</CodeBlock>

- ## Usage, Convex
-
- Here's an example with [Convex](https://convex.dev/) as a cache.
-
- ### Create project
-
- Get a working [Convex](https://docs.convex.dev/) project set up, for example by using:
-
- ```bash
- npm create convex@latest
- ```
-
- ### Add database accessors
-
- Add query and mutation helpers to `convex/langchain/db.ts`:
-
- ```ts title="convex/langchain/db.ts"
- export * from "langchain/util/convex";
- ```
-
- ### Configure your schema
-
- Set up your schema (for indexing):
-
- ```ts title="convex/schema.ts"
- import { defineSchema, defineTable } from "convex/server";
- import { v } from "convex/values";
-
- export default defineSchema({
-   cache: defineTable({
-     key: v.string(),
-     value: v.any(),
-   }).index("byKey", ["key"]),
- });
- ```
-
- ### Example
-
- <CodeBlock language="typescript">{ConvexExample}</CodeBlock>
-
- ## Usage, Redis
+ ## Redis

Here's an example with a Redis cache.

@@ -87,3 +54,9 @@ npm install ioredis
```

<CodeBlock language="typescript">{RedisExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned how to use caching to avoid recomputing embeddings.
+
+ Next, check out the [full tutorial on retrieval-augmented generation](/docs/tutorials/rag).
20 changes: 16 additions & 4 deletions docs/core_docs/docs/how_to/contextual_compression.mdx
@@ -1,9 +1,14 @@
- ---
- hide_table_of_contents: true
- ---

# How to do retrieval with contextual compression

+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+ - [Retrieval-augmented generation (RAG)](/docs/tutorials/rag)
+
+ :::

One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.

Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.
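
A sketch of the pattern, assuming `LLMChainExtractor` as the compressor and an in-memory vector store as the base retriever; other compressors and base retrievers slot in the same way:

```typescript
import { OpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ContextualCompressionRetriever } from "langchain/retrievers/contextual_compression";
import { LLMChainExtractor } from "langchain/retrievers/document_compressors/chain_extract";

const vectorstore = await MemoryVectorStore.fromTexts(
  ["A long document where only one sentence answers the question..."],
  [{}],
  new OpenAIEmbeddings()
);

// The compressor uses an LLM to keep only the query-relevant parts of each doc.
const retriever = new ContextualCompressionRetriever({
  baseCompressor: LLMChainExtractor.fromLLM(new OpenAI({ temperature: 0 })),
  baseRetriever: vectorstore.asRetriever(),
});

const compressedDocs = await retriever.invoke("What does the document answer?");
```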
@@ -58,3 +63,10 @@ This skips the need to add documents to a vector store to perform similarity sea
import DocumentCompressorPipelineExample from "@examples/retrievers/document_compressor_pipeline.ts";

<CodeBlock language="typescript">{DocumentCompressorPipelineExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned a few ways to use contextual compression to remove bad data from your results.
+
+ See the individual sections for deeper dives on specific retrievers, the [broader tutorial on RAG](/docs/tutorials/rag), or this section to learn how to
+ [create your own custom retriever over any data source](/docs/modules/data_connection/retrievers/custom).
27 changes: 18 additions & 9 deletions docs/core_docs/docs/how_to/custom_retriever.mdx
@@ -1,14 +1,17 @@
- ---
- hide_table_of_contents: true
- sidebar_position: 0
- ---

# How to write a custom retriever class

- To create your own retriever, you need to extend the [`BaseRetriever` class](https://api.js.langchain.com/classes/langchain_core_retrievers.BaseRetriever.html)
- and implement a `_getRelevantDocuments` method that takes a `string` as its first parameter and an optional `runManager` for tracing.
- This method should return an array of `Document`s fetched from some source. This process can involve calls to a database or to the web using `fetch`.
- Note the underscore before `_getRelevantDocuments()` - the base class wraps the non-prefixed version in order to automatically handle tracing of the original call.
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+
+ :::

+ To create your own retriever, you need to extend the [`BaseRetriever`](https://api.js.langchain.com/classes/langchain_core_retrievers.BaseRetriever.html) class
+ and implement a `_getRelevantDocuments` method that takes a `string` as its first parameter (and an optional `runManager` for tracing).
+ This method should return an array of `Document`s fetched from some source. This process can involve calls to a database, to the web using `fetch`, or any other source.
+ Note the underscore before `_getRelevantDocuments()`. The base class wraps the non-prefixed version in order to automatically handle tracing of the original call.

Here's an example of a custom retriever that returns static documents:
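
The example file itself is collapsed in this view; a minimal sketch along the lines described above, assuming the `@langchain/core` retriever APIs, could look like:

```typescript
import {
  BaseRetriever,
  type BaseRetrieverInput,
} from "@langchain/core/retrievers";
import type { CallbackManagerForRetrieverRun } from "@langchain/core/callbacks/manager";
import { Document } from "@langchain/core/documents";

export class StaticRetriever extends BaseRetriever {
  lc_namespace = ["langchain", "retrievers"];

  constructor(fields?: BaseRetrieverInput) {
    super(fields);
  }

  // Note the underscore: the base class wraps this method to add tracing.
  async _getRelevantDocuments(
    query: string,
    _runManager?: CallbackManagerForRetrieverRun
  ): Promise<Document[]> {
    // A real implementation would query a database or `fetch` a web API here.
    return [
      new Document({ pageContent: `Some document pertaining to ${query}` }),
    ];
  }
}

const retriever = new StaticRetriever();
await retriever.invoke("LangChain docs");
```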

@@ -70,3 +73,9 @@ await retriever.invoke("LangChain docs");
}
]
```

+ ## Next steps
+
+ You've now seen an example of implementing your own custom retriever.
+
+ Next, check out the individual sections for deeper dives on specific retrievers, or the [broader tutorial on RAG](/docs/tutorials/rag).
16 changes: 13 additions & 3 deletions docs/core_docs/docs/how_to/embed_text.mdx
@@ -8,16 +8,20 @@ sidebar_position: 2
Head to [Integrations](/docs/integrations/text_embedding) for documentation on built-in integrations with text embedding providers.
:::

- The Embeddings class is a class designed for interfacing with text embedding models. There are lots of embedding model providers (OpenAI, Cohere, Hugging Face, etc) - this class is designed to provide a standard interface for all of them.
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Embeddings](/docs/concepts/#embedding-models)
+
+ :::

Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

The base Embeddings class in LangChain exposes two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).

- ## Get started

- Embeddings can be used to create a numerical representation of textual data. This numerical representation is useful because it can be used to find similar documents.

Below is an example of how to use the OpenAI embeddings. Embeddings occasionally have different embedding methods for queries versus documents, so the embedding class exposes both an `embedQuery` and an `embedDocuments` method.
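
The full example is collapsed here, but a minimal sketch of the two methods, assuming `@langchain/openai`, looks roughly like:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();

// Embed a single search query...
const queryVector = await embeddings.embedQuery("What is LangChain?");

// ...and embed a batch of documents to be indexed.
const documentVectors = await embeddings.embedDocuments([
  "Hello world",
  "Bye bye",
]);

console.log(queryVector.length, documentVectors.length); // e.g. 1536 2
```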

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
@@ -77,3 +81,9 @@ const documentRes = await embeddings.embedDocuments(["Hello world", "Bye bye"]);
]
*/
```

+ ## Next steps
+
+ You've now learned how to use embeddings models with queries and text.
+
+ Next, check out how to [avoid excessively recomputing embeddings with caching](/docs/how_to/caching_embeddings), or the [full tutorial on retrieval-augmented generation](/docs/tutorials/rag).
6 changes: 3 additions & 3 deletions docs/core_docs/docs/how_to/index.mdx
@@ -127,14 +127,14 @@ Embedding Models take a piece of text and create a numerical representation of it.

Vector stores are databases that can efficiently store and retrieve embeddings.

- - [How to: use a vector store to retrieve data](/docs/how_to/vectorstores)
+ - [How to: create and query vector stores](/docs/how_to/vectorstores)

### Retrievers

Retrievers are responsible for taking a query and returning relevant documents.

- [How to: use a vector store to retrieve data](/docs/how_to/vectorstore_retriever)
- - [How to: generate multiple queries to retrieve data for](/docs/how_to/MultiQueryRetriever)
+ - [How to: generate multiple queries to retrieve data for](/docs/how_to/multiple_queries)
- [How to: use contextual compression to compress the data retrieved](/docs/how_to/contextual_compression)
- [How to: write a custom retriever class](/docs/how_to/custom_retriever)
- [How to: add similarity scores to retriever results](/docs/how_to/add_scores_retriever)
@@ -144,7 +144,7 @@ Retrievers are responsible for taking a query and returning relevant documents.
- [How to: retrieve the whole document for a chunk](/docs/how_to/parent_document_retriever)
- [How to: generate metadata filters](/docs/how_to/self_query)
- [How to: create a time-weighted retriever](/docs/how_to/time_weighted_vectorstore)
- - [How to: use a Matryoshka retriever](/docs/how_to/matryoshka_retriever)
+ - [How to: reduce retrieval latency](/docs/how_to/reduce_retrieval_latency)

### Indexing

29 changes: 21 additions & 8 deletions docs/core_docs/docs/how_to/multi_vector.mdx
@@ -1,18 +1,24 @@
- ---
- hide_table_of_contents: true
- ---

# How to generate multiple embeddings per document

- It can often be beneficial to store multiple vectors per document.
- LangChain has a base MultiVectorRetriever which makes querying this type of setup easier!
+ :::info Prerequisites
+
+ This guide assumes familiarity with the following concepts:
+
+ - [Retrievers](/docs/concepts/#retrievers)
+ - [Text splitters](/docs/concepts/#text-splitters)
+ - [Retrieval-augmented generation (RAG)](/docs/tutorials/rag)
+
+ :::

+ Embedding different representations of an original document, then returning the original document when any of the representations results in a search hit, can allow you to
+ tune and improve your retrieval performance. LangChain has a base [`MultiVectorRetriever`](https://api.js.langchain.com/classes/langchain_retrievers_multi_vector.MultiVectorRetriever.html) designed to do just this!

A lot of the complexity lies in how to create the multiple vectors per document.
- This notebook covers some of the common ways to create those vectors and use the MultiVectorRetriever.
+ This guide covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.

Some methods to create multiple vectors per document include:

- - smaller chunks: split a document into smaller chunks, and embed those (e.g. the [ParentDocumentRetriever](/docs/modules/data_connection/retrievers/parent-document-retriever))
+ - smaller chunks: split a document into smaller chunks, and embed those (e.g. the [`ParentDocumentRetriever`](/docs/modules/data_connection/retrievers/parent-document-retriever))
- summary: create a summary for each document, embed that along with (or instead of) the document
- hypothetical questions: create hypothetical questions that each document would be appropriate to answer, embed those along with (or instead of) the document
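
A rough sketch of the pattern all three methods share (index the small representations, store the parents, return the parents), assuming the `docstore`/`idKey` options, an in-memory store, and the `uuid` package:

```typescript
import { MultiVectorRetriever } from "langchain/retrievers/multi_vector";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { InMemoryStore } from "langchain/storage/in_memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";
import { v4 as uuidv4 } from "uuid";

const idKey = "doc_id";
const parentDocs = [
  new Document({ pageContent: "A long original document..." }),
];
const docIds = parentDocs.map(() => uuidv4());

// Small representations (here, stand-in "summaries") carry the parent id.
const summaries = parentDocs.map(
  (doc, i) =>
    new Document({
      pageContent: doc.pageContent.slice(0, 50), // replace with a real LLM summary
      metadata: { [idKey]: docIds[i] },
    })
);

// The vector store indexes only the small representations...
const vectorstore = await MemoryVectorStore.fromDocuments(
  summaries,
  new OpenAIEmbeddings()
);
// ...while the doc store holds the full parents, keyed by id.
const docstore = new InMemoryStore<Document>();
await docstore.mset(docIds.map((id, i) => [id, parentDocs[i]]));

const retriever = new MultiVectorRetriever({ vectorstore, docstore, idKey });

// The query matches a summary, but the full parent document is returned.
const results = await retriever.invoke("long original");
```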

@@ -54,3 +60,10 @@ These questions can then be embedded and used to retrieve the original document:
import HypotheticalExample from "@examples/retrievers/multi_vector_hypothetical.ts";

<CodeBlock language="typescript">{HypotheticalExample}</CodeBlock>

+ ## Next steps
+
+ You've now learned a few ways to generate multiple embeddings per document.
+
+ Next, check out the individual sections for deeper dives on specific retrievers, the [broader tutorial on RAG](/docs/tutorials/rag), or this section to learn how to
+ [create your own custom retriever over any data source](/docs/modules/data_connection/retrievers/custom).