Commit 37d5496

sinedied and jacoblee93 authored
community[minor], azure-cosmosdb[major]: add new integration for Azure CosmosDB for NoSQL (#6133)
* refactor: rename azure cosmosdb for mongodb vcore integration
* feat: add compatibility import
* refactor: update old imports
* feat: add Azure CosmosDB for NoSQL VectorStore integration
* fix: connection string
* refactor: formatting
* chore: add imports
* fix: connection string parsing
* refactor: do not use bulk for document creation
* fix: split delete operations into batches of 100
* fix: search results and metadata storage
* docs: add Azure CosmosDB for NoSQL example
* fix: embedding key init
* fix: filter where clause
* refactor: improve delete
* fix: delete
* fix: mmr search
* docs: fix typo
* test: add integration tests
* refactor: format and lint
* fix: id usage during create
* test: add unit tests
* chore: create new langchain-azure-cosmosdb package
* chore: add deprecated message to community vcore
* refactor: migrate code to the new pakcage
* chore: update example env file
* test: update tests
* chore: clean up
* docs: update docs to use the new package
* chore: update examples to use the new package
* chore: lint/formatting
* test: fix wrong id format
* fix: wrong naming
* feat: add support for managed identity
* docs: add note on managed identity
* test: add unit test for managed identity
* chore: fix package name
* chore: fix mongodb version
* chore: revert old integration renaming
* chore: revert old package renaming (part 2)
* chore: revert renaming old integration (part 3)
* feat: make filters more generic to allow offset limit filters
* chore: update lockfile
* fix: example name
* chore: remove todo
* chore: update langchain versions
* refactor: defer init outside of constructor
* refactor: defer init to initialize method
* chore: update lockfile
* refactor: rename CosmosDB for MongoDB integration
* docs: add related links (auto)
* Lint + format
* Fix build

---------

Co-authored-by: jacoblee93 <[email protected]>
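The commit message lists "split delete operations into batches of 100", but the part of the diff shown here does not include that code. As a rough sketch of the idea only, and not the implementation from this commit, the hypothetical `toBatches` helper below cuts a list of ids into groups of at most 100 so that each group can be sent as one delete request:

```typescript
// Hypothetical helper (not part of @langchain/azure-cosmosdb):
// split a list into consecutive batches of at most `batchSize` items.
function toBatches<T>(items: readonly T[], batchSize = 100): T[][] {
  if (Math.floor(batchSize) !== batchSize || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    // slice() clamps the end index, so the last batch may be shorter.
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}
```

A caller would typically loop over the returned batches and await one delete call per batch; the default of 100 mirrors the number in the commit message and is otherwise an assumption.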
1 parent 3a2c713 commit 37d5496

33 files changed, +3920 -56 lines changed

docs/core_docs/.gitignore

+4
Original file line number | Diff line number | Diff line change
@@ -212,8 +212,12 @@ docs/how_to/assign.md
212212
docs/how_to/assign.mdx
213213
docs/how_to/agent_executor.md
214214
docs/how_to/agent_executor.mdx
215+
docs/integrations/tools/duckduckgo_search.md
216+
docs/integrations/tools/duckduckgo_search.mdx
215217
docs/integrations/toolkits/sql.md
216218
docs/integrations/toolkits/sql.mdx
219+
docs/integrations/toolkits/openapi.md
220+
docs/integrations/toolkits/openapi.mdx
217221
docs/integrations/text_embedding/togetherai.md
218222
docs/integrations/text_embedding/togetherai.mdx
219223
docs/integrations/text_embedding/openai.md

docs/core_docs/docs/integrations/platforms/microsoft.mdx

+20-4
@@ -100,20 +100,36 @@ See a [usage example](/docs/integrations/vectorstores/azure_aisearch).
100100
import { AzureAISearchVectorStore } from "@langchain/community/vectorstores/azure_aisearch";
101101
```
102102

103+
### Azure Cosmos DB for NoSQL
104+
105+
> [Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/) provides support for querying items with flexible schemas and native support for JSON. It now offers vector indexing and search. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors directly in the documents alongside your data. Each document in your database can contain not only traditional schema-free data, but also high-dimensional vectors as other properties of the documents.
106+
107+
<IntegrationInstallTooltip></IntegrationInstallTooltip>
108+
109+
```bash npm2yarn
110+
npm install @langchain/azure-cosmosdb
111+
```
112+
113+
See a [usage example](/docs/integrations/vectorstores/azure_cosmosdb_nosql).
114+
115+
```typescript
116+
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";
117+
```
118+
103119
### Azure Cosmos DB for MongoDB vCore
104120

105-
> [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support. You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account’s connection string. Use vector search in Azure Cosmos DB for MongoDB vCore to seamlessly integrate your AI-based applications with your data that’s stored in Azure Cosmos DB.
121+
> [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support. You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account’s connection string. Use vector search in Azure Cosmos DB for MongoDB vCore to seamlessly integrate your AI-based applications with your data that’s stored in Azure Cosmos DB.
106122
107123
<IntegrationInstallTooltip></IntegrationInstallTooltip>
108124

109125
```bash npm2yarn
110-
npm install @langchain/community mongodb
126+
npm install @langchain/azure-cosmosdb
111127
```
112128

113-
See a [usage example](/docs/integrations/vectorstores/azure_cosmosdb).
129+
See a [usage example](/docs/integrations/vectorstores/azure_cosmosdb_mongodb).
114130

115131
```typescript
116-
import { AzureCosmosDBVectorStore } from "@langchain/community/vectorstores/azure_cosmosdb";
132+
import { AzureCosmosDBMongoDBVectorStore } from "@langchain/azure-cosmosdb";
117133
```
118134

119135
## Document loaders

docs/core_docs/docs/integrations/vectorstores/azure_cosmosdb.mdx renamed to docs/core_docs/docs/integrations/vectorstores/azure_cosmosdb_mongodb.mdx

+6-6
@@ -1,29 +1,29 @@
1-
# Azure Cosmos DB
1+
# Azure Cosmos DB for MongoDB vCore
22

3-
> [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support. You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account’s connection string. Use vector search in Azure Cosmos DB for MongoDB vCore to seamlessly integrate your AI-based applications with your data that’s stored in Azure Cosmos DB.
3+
> [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support. You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account’s connection string. Use vector search in Azure Cosmos DB for MongoDB vCore to seamlessly integrate your AI-based applications with your data that’s stored in Azure Cosmos DB.
44
55
Azure Cosmos DB for MongoDB vCore provides developers with a fully managed MongoDB-compatible database service for building modern applications with a familiar architecture.
66

77
Learn how to leverage the vector search capabilities of Azure Cosmos DB for MongoDB vCore from [this page](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/vector-search). If you don't have an Azure account, you can [create a free account](https://azure.microsoft.com/free/) to get started.
88

99
## Setup
1010

11-
You'll first need to install the `mongodb` SDK and the [`@langchain/community`](https://www.npmjs.com/package/@langchain/community) package:
11+
You'll first need to install the [`@langchain/azure-cosmosdb`](https://www.npmjs.com/package/@langchain/azure-cosmosdb) package:
1212

1313
import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
1414

1515
<IntegrationInstallTooltip></IntegrationInstallTooltip>
1616

1717
```bash npm2yarn
18-
npm install @langchain/community mongodb
18+
npm install @langchain/azure-cosmosdb
1919
```
2020

2121
You'll also need to have an Azure Cosmos DB for MongoDB vCore instance running. You can deploy a free version on Azure Portal without any cost, following [this guide](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/quickstart-portal).
2222

2323
Once you have your instance running, make sure you have the connection string and the admin key. You can find them in the Azure Portal, under the "Connection strings" section of your instance. Then you need to set the following environment variables:
2424

2525
import CodeBlock from "@theme/CodeBlock";
26-
import EnvVars from "@examples/indexes/vector_stores/azure_cosmosdb/.env.example";
26+
import EnvVars from "@examples/indexes/vector_stores/azure_cosmosdb_mongodb/.env.example";
2727

2828
<CodeBlock language="text">{EnvVars}</CodeBlock>
2929

@@ -32,7 +32,7 @@ import EnvVars from "@examples/indexes/vector_stores/azure_cosmosdb/.env.example
3232
Below is an example that indexes documents from a file in Azure Cosmos DB for MongoDB vCore, runs a vector search query, and finally uses a chain to answer a question in natural language
3333
based on the retrieved documents.
3434

35-
import Example from "@examples/indexes/vector_stores/azure_cosmosdb/azure_cosmosdb.ts";
35+
import Example from "@examples/indexes/vector_stores/azure_cosmosdb_mongodb/azure_cosmosdb_mongodb.ts";
3636

3737
<CodeBlock language="typescript">{Example}</CodeBlock>
3838

@@ -0,0 +1,54 @@
1+
# Azure Cosmos DB for NoSQL
2+
3+
> [Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/) provides support for querying items with flexible schemas and native support for JSON. It now offers vector indexing and search. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors directly in the documents alongside your data. Each document in your database can contain not only traditional schema-free data, but also high-dimensional vectors as other properties of the documents.
4+
5+
Learn how to leverage the vector search capabilities of Azure Cosmos DB for NoSQL from [this page](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search). If you don't have an Azure account, you can [create a free account](https://azure.microsoft.com/free/) to get started.
6+
7+
## Setup
8+
9+
You'll first need to install the [`@langchain/azure-cosmosdb`](https://www.npmjs.com/package/@langchain/azure-cosmosdb) package:
10+
11+
import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
12+
13+
<IntegrationInstallTooltip></IntegrationInstallTooltip>
14+
15+
```bash npm2yarn
16+
npm install @langchain/azure-cosmosdb
17+
```
18+
19+
You'll also need to have an Azure Cosmos DB for NoSQL instance running. You can deploy a free version on Azure Portal without any cost, following [this guide](https://learn.microsoft.com/azure/cosmos-db/nosql/quickstart-portal).
20+
21+
Once you have your instance running, make sure you have the connection string. You can find it in the Azure Portal, under the "Settings / Keys" section of your instance. Then you need to set the following environment variables:
22+
23+
import CodeBlock from "@theme/CodeBlock";
24+
import EnvVars from "@examples/indexes/vector_stores/azure_cosmosdb_nosql/.env.example";
25+
26+
<CodeBlock language="text">{EnvVars}</CodeBlock>
27+
28+
### Using Azure Managed Identity
29+
30+
If you're using Azure Managed Identity, you can configure the credentials like this:
31+
32+
import ManagedIdentityExample from "@examples/indexes/vector_stores/azure_cosmosdb_nosql/azure_cosmosdb_nosql-managed_identity.ts";
33+
34+
<CodeBlock language="typescript">{ManagedIdentityExample}</CodeBlock>
35+
36+
:::info
37+
38+
When using Azure Managed Identity and role-based access control, you must ensure that the database and container have been created beforehand. RBAC does not provide permissions to create databases and containers. You can get more information about the permission model in the [Azure Cosmos DB documentation](https://learn.microsoft.com/azure/cosmos-db/how-to-setup-rbac#permission-model).
39+
40+
:::
41+
42+
## Usage example
43+
44+
Below is an example that indexes documents from a file in Azure Cosmos DB for NoSQL, runs a vector search query, and finally uses a chain to answer a question in natural language
45+
based on the retrieved documents.
46+
47+
import Example from "@examples/indexes/vector_stores/azure_cosmosdb_nosql/azure_cosmosdb_nosql.ts";
48+
49+
<CodeBlock language="typescript">{Example}</CodeBlock>
50+
51+
## Related
52+
53+
- Vector store [conceptual guide](/docs/concepts/#vectorstores)
54+
- Vector store [how-to guides](/docs/how_to/#vectorstores)
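
The Azure Cosmos DB for NoSQL setup section above asks for a connection string from the "Settings / Keys" page. Such strings are semicolon-separated `Name=value` pairs, typically of the form `AccountEndpoint=...;AccountKey=...;`. The sketch below shows one way to read the two values; `parseCosmosConnectionString` and `CosmosConnectionInfo` are hypothetical names used for illustration, and this is not the parser used by the package.

```typescript
// Hypothetical shape for the parsed values; not a type exported by the package.
interface CosmosConnectionInfo {
  endpoint: string;
  key: string | undefined;
}

// Hypothetical parser for "AccountEndpoint=...;AccountKey=...;" strings.
function parseCosmosConnectionString(
  connectionString: string
): CosmosConnectionInfo {
  const pairs: Record<string, string | undefined> = {};
  for (const segment of connectionString.split(";")) {
    const trimmed = segment.trim();
    if (trimmed === "") continue; // tolerate the trailing ";"
    const separator = trimmed.indexOf("=");
    if (separator < 1) {
      throw new Error("Malformed connection string segment");
    }
    // Split on the first "=" only: base64 keys usually end with "=" padding.
    pairs[trimmed.slice(0, separator)] = trimmed.slice(separator + 1);
  }
  const endpoint = pairs["AccountEndpoint"];
  if (!endpoint) {
    throw new Error("Connection string has no AccountEndpoint");
  }
  return { endpoint, key: pairs["AccountKey"] };
}
```

Splitting on the first `=` rather than on every `=` matters because account keys are base64 text and usually carry trailing padding characters.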

examples/package.json

+1
@@ -35,6 +35,7 @@
3535
"@google/generative-ai": "^0.7.0",
3636
"@langchain/anthropic": "workspace:*",
3737
"@langchain/aws": "workspace:*",
38+
"@langchain/azure-cosmosdb": "workspace:*",
3839
"@langchain/azure-dynamic-sessions": "workspace:^",
3940
"@langchain/azure-openai": "workspace:*",
4041
"@langchain/baidu-qianfan": "workspace:*",
@@ -0,0 +1 @@
1+
AZURE_COSMOSDB_MONGODB_CONNECTION_STRING=

examples/src/indexes/vector_stores/azure_cosmosdb/azure_cosmosdb.ts renamed to examples/src/indexes/vector_stores/azure_cosmosdb_mongodb/azure_cosmosdb_mongodb.ts

+6-6
@@ -1,7 +1,7 @@
11
import {
2-
AzureCosmosDBVectorStore,
3-
AzureCosmosDBSimilarityType,
4-
} from "@langchain/community/vectorstores/azure_cosmosdb";
2+
AzureCosmosDBMongoDBVectorStore,
3+
AzureCosmosDBMongoDBSimilarityType,
4+
} from "@langchain/azure-cosmosdb";
55
import { ChatPromptTemplate } from "@langchain/core/prompts";
66
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
77
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
@@ -18,8 +18,8 @@ const splitter = new RecursiveCharacterTextSplitter({
1818
});
1919
const documents = await splitter.splitDocuments(rawDocuments);
2020

21-
// Create Azure Cosmos DB vector store
22-
const store = await AzureCosmosDBVectorStore.fromDocuments(
21+
// Create Azure Cosmos DB for MongoDB vCore vector store
22+
const store = await AzureCosmosDBMongoDBVectorStore.fromDocuments(
2323
documents,
2424
new OpenAIEmbeddings(),
2525
{
@@ -28,7 +28,7 @@ const store = await AzureCosmosDBVectorStore.fromDocuments(
2828
indexOptions: {
2929
numLists: 100,
3030
dimensions: 1536,
31-
similarity: AzureCosmosDBSimilarityType.COS,
31+
similarity: AzureCosmosDBMongoDBSimilarityType.COS,
3232
},
3333
}
3434
);
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
# Use connection string to authenticate
2+
AZURE_COSMOSDB_NOSQL_CONNECTION_STRING=
3+
4+
# Use managed identity to authenticate
5+
AZURE_COSMOSDB_NOSQL_ENDPOINT=
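
The example env file above offers two ways to authenticate: a connection string, or an endpoint used with managed identity. A minimal sketch of how application code might choose between them follows. The `resolveAuthConfig` function, the `CosmosAuthConfig` type, and the precedence (connection string first) are assumptions made for illustration; they do not describe how the package itself resolves credentials.

```typescript
// Hypothetical result type; not exported by @langchain/azure-cosmosdb.
type CosmosAuthConfig =
  | { mode: "connection-string"; connectionString: string }
  | { mode: "managed-identity"; endpoint: string };

// Hypothetical resolver. The environment is passed in as a plain object
// (for example process.env in Node.js) so the function stays easy to test.
function resolveAuthConfig(
  env: Record<string, string | undefined>
): CosmosAuthConfig {
  const connectionString = env["AZURE_COSMOSDB_NOSQL_CONNECTION_STRING"];
  if (connectionString) {
    // Assumption: a connection string, when present, takes precedence.
    return { mode: "connection-string", connectionString };
  }
  const endpoint = env["AZURE_COSMOSDB_NOSQL_ENDPOINT"];
  if (endpoint) {
    return { mode: "managed-identity", endpoint };
  }
  throw new Error(
    "Set AZURE_COSMOSDB_NOSQL_CONNECTION_STRING or AZURE_COSMOSDB_NOSQL_ENDPOINT"
  );
}
```

Failing early with a message that names both variables makes a missing configuration easier to diagnose than a later connection error.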
@@ -0,0 +1,12 @@
1+
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";
2+
import { OpenAIEmbeddings } from "@langchain/openai";
3+
4+
// Create Azure Cosmos DB vector store
5+
const store = new AzureCosmosDBNoSQLVectorStore(new OpenAIEmbeddings(), {
6+
// Or use environment variable AZURE_COSMOSDB_NOSQL_ENDPOINT
7+
endpoint: "https://my-cosmosdb.documents.azure.com:443/",
8+
9+
// Database and container must already exist
10+
databaseName: "my-database",
11+
containerName: "my-container",
12+
});
@@ -0,0 +1,76 @@
1+
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";
2+
import { ChatPromptTemplate } from "@langchain/core/prompts";
3+
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
4+
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
5+
import { createRetrievalChain } from "langchain/chains/retrieval";
6+
import { TextLoader } from "langchain/document_loaders/fs/text";
7+
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
8+
9+
// Load documents from file
10+
const loader = new TextLoader("./state_of_the_union.txt");
11+
const rawDocuments = await loader.load();
12+
const splitter = new RecursiveCharacterTextSplitter({
13+
chunkSize: 1000,
14+
chunkOverlap: 0,
15+
});
16+
const documents = await splitter.splitDocuments(rawDocuments);
17+
18+
// Create Azure Cosmos DB vector store
19+
const store = await AzureCosmosDBNoSQLVectorStore.fromDocuments(
20+
documents,
21+
new OpenAIEmbeddings(),
22+
{
23+
databaseName: "langchain",
24+
containerName: "documents",
25+
}
26+
);
27+
28+
// Performs a similarity search
29+
const resultDocuments = await store.similaritySearch(
30+
"What did the president say about Ketanji Brown Jackson?"
31+
);
32+
33+
console.log("Similarity search results:");
34+
console.log(resultDocuments[0].pageContent);
35+
/*
36+
Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections.
37+
38+
Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.
39+
40+
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.
41+
42+
And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
43+
*/
44+
45+
// Use the store as part of a chain
46+
const model = new ChatOpenAI({ model: "gpt-3.5-turbo-1106" });
47+
const questionAnsweringPrompt = ChatPromptTemplate.fromMessages([
48+
[
49+
"system",
50+
"Answer the user's questions based on the below context:\n\n{context}",
51+
],
52+
["human", "{input}"],
53+
]);
54+
55+
const combineDocsChain = await createStuffDocumentsChain({
56+
llm: model,
57+
prompt: questionAnsweringPrompt,
58+
});
59+
60+
const chain = await createRetrievalChain({
61+
retriever: store.asRetriever(),
62+
combineDocsChain,
63+
});
64+
65+
const res = await chain.invoke({
66+
input: "What is the president's top priority regarding prices?",
67+
});
68+
69+
console.log("Chain response:");
70+
console.log(res.answer);
71+
/*
72+
The president's top priority is getting prices under control.
73+
*/
74+
75+
// Clean up
76+
await store.delete();

examples/src/indexes/vector_stores/azure_cosmosdb/.env.example renamed to libs/langchain-azure-cosmosdb/.env.example

+7-1
@@ -1,5 +1,11 @@
1+
# Azure CosmosDB for NoSQL connection string
2+
AZURE_COSMOSDB_NOSQL_CONNECTION_STRING=
3+
4+
# Azure CosmosDB for NoSQL endpoint (if you're using managed identity)
5+
AZURE_COSMOSDB_NOSQL_ENDPOINT=
6+
17
# Azure CosmosDB for MongoDB vCore connection string
2-
AZURE_COSMOSDB_CONNECTION_STRING=
8+
AZURE_COSMOSDB_MONGODB_CONNECTION_STRING=
39

410
# If you're using Azure OpenAI API, you'll need to set these variables
511
AZURE_OPENAI_API_KEY=
@@ -0,0 +1,66 @@
1+
module.exports = {
2+
extends: [
3+
"airbnb-base",
4+
"eslint:recommended",
5+
"prettier",
6+
"plugin:@typescript-eslint/recommended",
7+
],
8+
parserOptions: {
9+
ecmaVersion: 12,
10+
parser: "@typescript-eslint/parser",
11+
project: "./tsconfig.json",
12+
sourceType: "module",
13+
},
14+
plugins: ["@typescript-eslint", "no-instanceof"],
15+
ignorePatterns: [
16+
".eslintrc.cjs",
17+
"scripts",
18+
"node_modules",
19+
"dist",
20+
"dist-cjs",
21+
"*.js",
22+
"*.cjs",
23+
"*.d.ts",
24+
],
25+
rules: {
26+
"no-process-env": 2,
27+
"no-instanceof/no-instanceof": 2,
28+
"@typescript-eslint/explicit-module-boundary-types": 0,
29+
"@typescript-eslint/no-empty-function": 0,
30+
"@typescript-eslint/no-shadow": 0,
31+
"@typescript-eslint/no-empty-interface": 0,
32+
"@typescript-eslint/no-use-before-define": ["error", "nofunc"],
33+
"@typescript-eslint/no-unused-vars": ["warn", { args: "none" }],
34+
"@typescript-eslint/no-floating-promises": "error",
35+
"@typescript-eslint/no-misused-promises": "error",
36+
camelcase: 0,
37+
"class-methods-use-this": 0,
38+
"import/extensions": [2, "ignorePackages"],
39+
"import/no-extraneous-dependencies": [
40+
"error",
41+
{ devDependencies: ["**/*.test.ts"] },
42+
],
43+
"import/no-unresolved": 0,
44+
"import/prefer-default-export": 0,
45+
"keyword-spacing": "error",
46+
"max-classes-per-file": 0,
47+
"max-len": 0,
48+
"no-await-in-loop": 0,
49+
"no-bitwise": 0,
50+
"no-console": 0,
51+
"no-restricted-syntax": 0,
52+
"no-shadow": 0,
53+
"no-continue": 0,
54+
"no-void": 0,
55+
"no-underscore-dangle": 0,
56+
"no-use-before-define": 0,
57+
"no-useless-constructor": 0,
58+
"no-return-await": 0,
59+
"consistent-return": 0,
60+
"no-else-return": 0,
61+
"func-names": 0,
62+
"no-lonely-if": 0,
63+
"prefer-rest-params": 0,
64+
"new-cap": ["error", { properties: false, capIsNew: false }],
65+
},
66+
};
+7
@@ -0,0 +1,7 @@
1+
index.cjs
2+
index.js
3+
index.d.ts
4+
index.d.cts
5+
node_modules
6+
dist
7+
.yarn
+19
@@ -0,0 +1,19 @@
1+
{
2+
"$schema": "https://json.schemastore.org/prettierrc",
3+
"printWidth": 80,
4+
"tabWidth": 2,
5+
"useTabs": false,
6+
"semi": true,
7+
"singleQuote": false,
8+
"quoteProps": "as-needed",
9+
"jsxSingleQuote": false,
10+
"trailingComma": "es5",
11+
"bracketSpacing": true,
12+
"arrowParens": "always",
13+
"requirePragma": false,
14+
"insertPragma": false,
15+
"proseWrap": "preserve",
16+
"htmlWhitespaceSensitivity": "css",
17+
"vueIndentScriptAndStyle": false,
18+
"endOfLine": "lf"
19+
}
