feat(community): Update Voyage embeddings parameters #7689

Merged 4 commits on Feb 18, 2025
11 changes: 11 additions & 0 deletions docs/core_docs/docs/integrations/text_embedding/voyageai.mdx
@@ -8,12 +8,23 @@ The `inputType` parameter allows you to specify the type of input text for bette
- `document`: Use this for documents or content that you want to be retrievable. Voyage AI will prepend a prompt to optimize the embeddings for document use cases.
- `None` (default): The input text will be directly encoded without any additional prompt.

The class also supports the following optional parameters for customizing the embedding request:

- **truncation**: Whether to truncate the input texts to the maximum length allowed by the model.
- **outputDimension**: The desired dimension of the output embeddings.
- **outputDtype**: The data type of the output embeddings. Can be `"float"` or `"int8"`.
- **encodingFormat**: The format of the output embeddings. Can be `"float"`, `"base64"`, or `"ubinary"`.

```typescript
import { VoyageEmbeddings } from "@langchain/community/embeddings/voyage";

const embeddings = new VoyageEmbeddings({
  apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.VOYAGEAI_API_KEY
  inputType: "document", // Optional: "query" or "document"; omit to send no input type
  truncation: true, // Optional: enable truncation of input texts
  outputDimension: 768, // Optional: set desired output embedding dimension
  outputDtype: "float", // Optional: set output data type ("float" or "int8")
  encodingFormat: "float", // Optional: set output encoding format ("float", "base64", or "ubinary")
});
```
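To make the effect of `outputDimension` and `outputDtype` concrete, the sketch below estimates the raw size of a single embedding. It assumes that `"float"` values occupy 4 bytes each (32-bit floats) and `"int8"` values occupy 1 byte each; the helper `embeddingBytes` is hypothetical and is not part of `@langchain/community`.

```typescript
// Assumed storage cost per value. These sizes are an assumption for
// illustration, not something stated in this PR.
const BYTES_PER_VALUE: Record<"float" | "int8", number> = {
  float: 4, // assumed 32-bit floating point
  int8: 1, // one signed byte per dimension
};

// Hypothetical helper: raw byte size of one embedding vector.
function embeddingBytes(dimension: number, dtype: "float" | "int8"): number {
  if (dimension <= 0 || Math.floor(dimension) !== dimension) {
    throw new Error("dimension must be a positive integer");
  }
  return dimension * BYTES_PER_VALUE[dtype];
}

const floatSize = embeddingBytes(768, "float");
const int8Size = embeddingBytes(768, "int8");
console.log({ floatSize, int8Size });
```

Under these assumptions, a 768-dimensional vector takes 3072 bytes as `float` and 768 bytes as `int8`, which is the kind of trade-off these two options are meant to expose.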

63 changes: 61 additions & 2 deletions libs/langchain-community/src/embeddings/voyage.ts
@@ -19,6 +19,26 @@ export interface VoyageEmbeddingsParams extends EmbeddingsParams {
   * Input type for the embeddings request.
   */
  inputType?: string;

  /**
   * Whether to truncate the input texts to the maximum length allowed by the model.
   */
  truncation?: boolean;

  /**
   * The desired dimension of the output embeddings.
   */
  outputDimension?: number;

  /**
   * The data type of the output embeddings. Can be "float" or "int8".
   */
  outputDtype?: string;

  /**
   * The format of the output embeddings. Can be "float", "base64", or "ubinary".
   */
  encodingFormat?: string;
}

/**
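The new `outputDtype` and `encodingFormat` fields are typed as plain `string`, so a typo such as `"int-8"` would only surface when the API rejects the request. A minimal sketch of stricter typing with a runtime check follows. The union types and the `assertValidVoyageOptions` helper are hypothetical, and the allowed values are copied from the doc comments in this hunk; the upstream API may accept a different set.

```typescript
// Allowed values as listed in the doc comments of this PR (not verified
// against the upstream API).
const OUTPUT_DTYPES = ["float", "int8"] as const;
const ENCODING_FORMATS = ["float", "base64", "ubinary"] as const;

type OutputDtype = (typeof OUTPUT_DTYPES)[number];
type EncodingFormat = (typeof ENCODING_FORMATS)[number];

// Hypothetical options shape mirroring the new optional fields.
interface VoyageOptionCheck {
  outputDimension?: number;
  outputDtype?: string;
  encodingFormat?: string;
}

function isOneOf(allowed: readonly string[], value: string): boolean {
  return allowed.indexOf(value) !== -1;
}

// Hypothetical helper: throws early instead of waiting for an API error.
function assertValidVoyageOptions(options: VoyageOptionCheck): void {
  const { outputDimension, outputDtype, encodingFormat } = options;
  if (
    outputDimension !== undefined &&
    (outputDimension <= 0 || Math.floor(outputDimension) !== outputDimension)
  ) {
    throw new Error(`outputDimension must be a positive integer, got ${outputDimension}`);
  }
  if (outputDtype !== undefined && !isOneOf(OUTPUT_DTYPES, outputDtype)) {
    throw new Error(`Unsupported outputDtype: ${outputDtype}`);
  }
  if (encodingFormat !== undefined && !isOneOf(ENCODING_FORMATS, encodingFormat)) {
    throw new Error(`Unsupported encodingFormat: ${encodingFormat}`);
  }
}

// Values typed with the unions are checked at compile time as well.
const dtype: OutputDtype = "int8";
const format: EncodingFormat = "base64";
assertValidVoyageOptions({ outputDimension: 512, outputDtype: dtype, encodingFormat: format });
```

Keeping the fields as `string` in the public interface avoids a breaking change if the API adds values later; a check like this could live in calling code instead.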
@@ -42,6 +62,26 @@ export interface CreateVoyageEmbeddingRequest {
   * Input type for the embeddings request.
   */
  input_type?: string;

  /**
   * Whether to truncate the input texts.
   */
  truncation?: boolean;

  /**
   * The desired dimension of the output embeddings.
   */
  output_dimension?: number;

  /**
   * The data type of the output embeddings.
   */
  output_dtype?: string;

  /**
   * The format of the output embeddings.
   */
  encoding_format?: string;
}

/**
@@ -65,6 +105,14 @@ export class VoyageEmbeddings

  inputType?: string;

  truncation?: boolean;

  outputDimension?: number;

  outputDtype?: string;

  encodingFormat?: string;

  /**
   * Constructor for the VoyageEmbeddings class.
   * @param fields - An optional object with properties to configure the instance.
@@ -73,7 +121,7 @@ export class VoyageEmbeddings
    fields?: Partial<VoyageEmbeddingsParams> & {
      verbose?: boolean;
      apiKey?: string;
-      inputType?: string; // Make inputType optional
+      inputType?: string;
    }
  ) {
    const fieldsWithDefaults = { ...fields };
@@ -92,6 +140,10 @@ export class VoyageEmbeddings
    this.apiKey = apiKey;
    this.apiUrl = `${this.basePath}/embeddings`;
    this.inputType = fieldsWithDefaults?.inputType;
    this.truncation = fieldsWithDefaults?.truncation;
    this.outputDimension = fieldsWithDefaults?.outputDimension;
    this.outputDtype = fieldsWithDefaults?.outputDtype;
    this.encodingFormat = fieldsWithDefaults?.encodingFormat;
  }

  /**
@@ -107,6 +159,10 @@ export class VoyageEmbeddings
        model: this.modelName,
        input: batch,
        input_type: this.inputType,
        truncation: this.truncation,
        output_dimension: this.outputDimension,
        output_dtype: this.outputDtype,
        encoding_format: this.encodingFormat,
      })
    );
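The request object above always includes the four new keys, even when the caller never set them. The sketch below illustrates why that is harmless if the body is serialized with `JSON.stringify`, which drops properties whose value is `undefined`. That serialization step is an assumption, since it is not visible in this hunk; `buildVoyageRequest`, the two interfaces, and the model name `"example-model"` are hypothetical stand-ins.

```typescript
// Hypothetical mirror of the camelCase options on the class.
interface VoyageOptionsSketch {
  inputType?: string;
  truncation?: boolean;
  outputDimension?: number;
  outputDtype?: string;
  encodingFormat?: string;
}

// Hypothetical mirror of the snake_case request shape from the diff.
interface VoyageRequestSketch {
  model: string;
  input: string | string[];
  input_type?: string;
  truncation?: boolean;
  output_dimension?: number;
  output_dtype?: string;
  encoding_format?: string;
}

// Maps camelCase options to the snake_case request, as the diff does inline.
function buildVoyageRequest(
  model: string,
  input: string | string[],
  options: VoyageOptionsSketch = {}
): VoyageRequestSketch {
  return {
    model,
    input,
    input_type: options.inputType,
    truncation: options.truncation,
    output_dimension: options.outputDimension,
    output_dtype: options.outputDtype,
    encoding_format: options.encodingFormat,
  };
}

// Unset options are `undefined` and therefore absent from the JSON body.
const exampleBody = JSON.stringify(
  buildVoyageRequest("example-model", ["hello"], { truncation: true, outputDimension: 512 })
);
console.log(exampleBody);
// prints {"model":"example-model","input":["hello"],"truncation":true,"output_dimension":512}
```

If the transport layer serialized the request differently (for example as form data), unset fields might need to be filtered out explicitly.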

@@ -135,6 +191,10 @@ export class VoyageEmbeddings
      model: this.modelName,
      input: text,
      input_type: this.inputType,
      truncation: this.truncation,
      output_dimension: this.outputDimension,
      output_dtype: this.outputDtype,
      encoding_format: this.encodingFormat,
    });

    return data[0].embedding;
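Note that `data[0].embedding` is returned unchanged. If `encodingFormat` is set to `"base64"` and the API responds with a base64 string rather than a number array, callers would need to decode it themselves. The sketch below shows one way to do that, assuming the payload is a packed array of little-endian 32-bit floats; that layout is an assumption and is not confirmed by this PR. Both helpers are hypothetical.

```typescript
// Hypothetical helper: decodes a base64 string into numbers, assuming the
// bytes are little-endian 32-bit floats (an assumption about the payload).
function decodeBase64Float32(encoded: string): number[] {
  const binary = atob(encoded);
  if (binary.length % 4 !== 0) {
    throw new Error(`Expected a multiple of 4 bytes, got ${binary.length}`);
  }
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  const view = new DataView(bytes.buffer);
  const values: number[] = [];
  for (let offset = 0; offset < bytes.length; offset += 4) {
    values.push(view.getFloat32(offset, true)); // true = little-endian
  }
  return values;
}

// Hypothetical inverse, used here only to build a round-trip example.
function encodeFloat32Base64(values: number[]): string {
  const buffer = new ArrayBuffer(values.length * 4);
  const view = new DataView(buffer);
  for (let i = 0; i < values.length; i += 1) {
    view.setFloat32(i * 4, values[i], true);
  }
  const bytes = new Uint8Array(buffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i += 1) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Round trip with values that are exactly representable as 32-bit floats.
const roundTripped = decodeBase64Float32(encodeFloat32Base64([0.5, -1.25, 2]));
console.log(roundTripped);
```

A different `outputDtype` would change the element width, so the decoder would need to match whatever type the API actually packs into the string.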
@@ -145,7 +205,6 @@ export class VoyageEmbeddings
   * @param request - An object with properties to configure the request.
   * @returns A Promise that resolves to the response from the Voyage AI API.
   */
-
  private async embeddingWithRetry(request: CreateVoyageEmbeddingRequest) {
    const makeCompletionRequest = async () => {
      const url = `${this.apiUrl}`;