A TypeScript library for interacting with AI models across providers through a standardized interface, with support for tool calling.
Related projects:
- llm-info: Information on LLM models, including context window token limits, output token limits, pricing, and more.
- ai-file-edit: A library for editing files using AI models.
- model-quirks: Quirks, edge cases, and interesting bits of various models.
- 16x Eval: Your personal workspace for prompt engineering and model evaluation.
- 🔄 Unified interface with comprehensive model support:
  - First-party providers (OpenAI, Anthropic, Google, DeepSeek)
  - Third-party providers (Vertex AI, Azure OpenAI, OpenRouter, Fireworks)
  - Custom providers with an OpenAI-compatible API
- 🔧 Supports function calling and system prompts
- 📝 Standardized message format and response structure
- 🛠️ Full TypeScript support for type safety
- 🎯 No additional dependencies for each provider
- 🛡️ Handles provider-specific edge cases (message formats, function calling, multi-round conversations)
- 🎨 Provider-specific options (headers, reasoning extraction)
- 🖼️ Support for image input in messages (base64 and URL formats)
- ⚡ Streaming support for real-time responses across all providers
import { sendPrompt } from "send-prompt";
import { AI_PROVIDERS, ModelEnum } from "llm-info";
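// Define a tool the model can call. This follows the same function-tool
// schema as the calculator example later in this README; the weather tool
// itself is just an illustrative placeholder.
const weatherTool = {
  type: "function",
  function: {
    name: "getWeather",
    description: "Get the current weather for a given city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "The city to get the weather for" },
      },
      required: ["city"],
      additionalProperties: false,
    },
    strict: true,
  },
};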
const response = await sendPrompt(
{
messages: [
{ role: "user", content: "What's the weather like in Singapore?" },
],
tools: [weatherTool],
},
{
model: ModelEnum["gpt-4.1"],
// model: ModelEnum["claude-3-7-sonnet-20250219"],
// model: ModelEnum["gemini-2.5-pro-exp-03-25"],
provider: AI_PROVIDERS.OPENAI,
// provider: AI_PROVIDERS.ANTHROPIC,
// provider: AI_PROVIDERS.GOOGLE,
apiKey: process.env.API_KEY,
}
);
console.log(response.message.content);
# Install llm-info to get the model and provider information
npm install llm-info
# Install send-prompt to send prompts to models
npm install send-prompt
The same `sendPrompt` function works across all providers:
import { sendPrompt } from "send-prompt";
import { AI_PROVIDERS, ModelEnum } from "llm-info";
// OpenAI
const openaiResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
model: ModelEnum["gpt-4.1"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-openai-api-key",
}
);
// Anthropic
const anthropicResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
model: ModelEnum["claude-3-7-sonnet-20250219"],
provider: AI_PROVIDERS.ANTHROPIC,
apiKey: "your-anthropic-api-key",
}
);
// Google
const googleResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
model: ModelEnum["gemini-2.5-pro-exp-03-25"],
provider: AI_PROVIDERS.GOOGLE,
apiKey: "your-google-api-key",
}
);
// Google Vertex AI (requires gcloud CLI authentication)
// https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal
const googleVertexResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
model: ModelEnum["gemini-2.5-pro-exp-03-25"],
provider: AI_PROVIDERS.GOOGLE_VERTEX_AI,
vertexai: true,
project: process.env.GOOGLE_CLOUD_PROJECT!, // Your Google Cloud project ID
location: process.env.GOOGLE_CLOUD_LOCATION!, // Your Google Cloud location (e.g., "us-central1")
}
);
// OpenRouter
const openrouterResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
customModel: "meta-llama/llama-4-scout:free",
provider: AI_PROVIDERS.OPENROUTER,
apiKey: "your-openrouter-api-key",
headers: {
"HTTP-Referer": "https://eval.16x.engineer/",
"X-Title": "16x Eval",
},
}
);
// Fireworks
const fireworksResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
customModel: "accounts/fireworks/models/deepseek-v3-0324",
provider: AI_PROVIDERS.FIREWORKS,
apiKey: "your-fireworks-api-key",
}
);
// DeepSeek
const deepseekResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
customModel: "deepseek-chat",
provider: AI_PROVIDERS.DEEPSEEK,
apiKey: "your-deepseek-api-key",
}
);
// Custom Provider
const customResponse = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
systemPrompt: "You are a helpful assistant.",
},
{
customModel: "custom-model",
provider: "custom",
baseURL: "https://your-custom-api.com/v1",
apiKey: "your-custom-api-key",
}
);
// All responses have the same structure
console.log(openaiResponse.message.content);
The `model` field is an enum of all models supported by the library. Using it helps avoid typos and ensures the correct model information is used.
If you want to use a model that is not yet available in the enum, use the `customModel` field instead. This is supported for all first-party providers (OpenAI, Anthropic, Google).
// Using custom model string for new models
const response = await sendPrompt(
{
messages: [{ role: "user", content: "Hello, who are you?" }],
},
{
customModel: "gpt-4o-mini", // Custom model string
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-openai-api-key",
}
);
Note that the `model` and `customModel` fields are mutually exclusive.
You can send images to models that support vision capabilities:
const imageResponse = await sendPrompt(
{
messages: [
{
role: "user",
content: [
{ type: "text", text: "What's in this image?" },
{
type: "image_url",
image_url: {
url: "data:image/jpeg;base64,/9j/4AAQSkZJRg...", // base64 image data
},
},
],
},
],
},
{
model: ModelEnum["gpt-4.1"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-openai-api-key",
}
);
// Define your tool
const calculatorTool = {
type: "function",
function: {
name: "calculator",
description: "Perform basic arithmetic calculations",
parameters: {
type: "object",
properties: {
operation: {
type: "string",
enum: ["add", "subtract", "multiply", "divide"],
description: "The arithmetic operation to perform",
},
a: {
type: "number",
description: "First number",
},
b: {
type: "number",
description: "Second number",
},
},
required: ["operation", "a", "b"],
additionalProperties: false,
},
strict: true,
},
};
const response = await sendPrompt(
{
messages: [{ role: "user", content: "What is 5 plus 3?" }],
tools: [calculatorTool],
},
{
model: ModelEnum["gpt-4.1"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-openai-api-key",
}
);
// Expected response structure:
// {
// tool_calls: [
// {
// id: "call_123",
// type: "function",
// function: {
// name: "calculator",
// arguments: '{"operation":"add","a":5,"b":3}'
// }
// }
// ]
// }
// Handle the function call
if (response.tool_calls) {
const toolCall = response.tool_calls[0];
console.log("Tool called:", toolCall.function.name);
console.log("Arguments:", JSON.parse(toolCall.function.arguments));
}
You can pass custom headers to providers using the `headers` option:
const response = await sendPrompt(
{
messages: [{ role: "user", content: "Hello" }],
},
{
model: ModelEnum["gpt-4.1"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-api-key",
headers: {
"Custom-Header": "value",
"X-Title": "My App",
},
}
);
You can control the randomness of the model's responses using the `temperature` parameter. Temperature ranges from 0 to 2: lower values make the output more focused and deterministic, while higher values make it more random and creative:
const response = await sendPrompt(
{
messages: [{ role: "user", content: "Write a creative story" }],
temperature: 0.8, // More creative and random
},
{
model: ModelEnum["gpt-4.1"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-api-key",
}
);
// For more deterministic responses
const deterministicResponse = await sendPrompt(
{
messages: [{ role: "user", content: "What is 2 + 2?" }],
temperature: 0.1, // More focused and deterministic
},
{
model: ModelEnum["claude-3-7-sonnet-20250219"],
provider: AI_PROVIDERS.ANTHROPIC,
apiKey: "your-api-key",
}
);
The temperature parameter is supported across all providers (OpenAI, Anthropic, Google, OpenRouter, Fireworks, DeepSeek, Azure OpenAI, and custom providers). If not specified, each provider will use its default temperature value.
For Anthropic models, you can control the maximum number of tokens in the response using the `anthropicMaxTokens` option:
const response = await sendPrompt(
{
messages: [{ role: "user", content: "Write a long story" }],
anthropicMaxTokens: 2000, // Limit response to 2000 tokens
},
{
model: ModelEnum["claude-3-7-sonnet-20250219"],
provider: AI_PROVIDERS.ANTHROPIC,
apiKey: "your-anthropic-api-key",
}
);
If not specified, it will use the model's default output token limit or 4096 tokens, whichever is smaller. When using function calling, it will default to 4096 tokens.
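As a rough illustration of that rule (a sketch only, not the library's actual code), the effective limit can be thought of as:
// Illustrative only: how the effective Anthropic max token limit is chosen,
// per the rule described above. `modelOutputTokenLimit` would come from the
// model's metadata (for example via llm-info).
function effectiveAnthropicMaxTokens(
  anthropicMaxTokens: number | undefined,
  modelOutputTokenLimit: number,
  usingTools: boolean
): number {
  if (anthropicMaxTokens !== undefined) return anthropicMaxTokens;
  return usingTools ? 4096 : Math.min(modelOutputTokenLimit, 4096);
}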
For providers that support it (like DeepSeek), you can extract the model's reasoning from the response:
const response = await sendPrompt(
{
messages: [
{ role: "user", content: "Solve this math problem: 2x + 5 = 15" },
],
},
{
model: ModelEnum["deepseek-reasoner"],
provider: AI_PROVIDERS.DEEPSEEK,
apiKey: "your-api-key",
}
);
if (response.reasoning) {
console.log("Model's reasoning:", response.reasoning);
}
You can stream responses to receive content in real time as it is generated. Streaming is supported for all providers.
const response = await sendPrompt(
{
messages: [{ role: "user", content: "Write a short story about a robot" }],
stream: true,
onStreamingContent: (content: string) => {
// This callback is called for each chunk of content
process.stdout.write(content);
},
},
{
model: ModelEnum["gpt-4o-mini"],
provider: AI_PROVIDERS.OPENAI,
apiKey: "your-openai-api-key",
}
);
// The function still returns the complete response at the end
console.log("\n\nComplete response:", response.message.content);
console.log("Duration:", response.durationMs, "ms");
if (response.usage) {
console.log("Token usage:", response.usage);
}
Streaming Limitations:
- Cannot be used with function calling (the `tools` parameter)
The response from `sendPrompt` follows a standardized format across all providers:
- `message`: The main response content
- `tool_calls`: Any function calls made by the model
- `reasoning`: The model's reasoning process (if available)
- `usage`: Token usage information
  - `promptTokens`: Number of tokens in the input messages
  - `thoughtsTokens`: Number of tokens used for reasoning (if available)
  - `completionTokens`: Number of tokens in the model's response (includes thoughts tokens)
  - `totalTokens`: Total tokens used (includes thoughts tokens)
- `durationMs`: The time taken by the API call in milliseconds
Example response:
{
message: {
role: "assistant",
content: "I am a helpful assistant."
},
usage: {
completionTokens: 10,
promptTokens: 20,
totalTokens: 30,
thoughtsTokens: 0
},
durationMs: 1234
}
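For reference, the response shape can be sketched as a TypeScript interface (field names taken from the list above; the library's actual exported type names may differ):
// Rough sketch of the response shape, based on the fields documented above.
// The library's real exported types may differ in naming and detail.
interface SendPromptResult {
  message: {
    role: "assistant";
    content: string;
  };
  tool_calls?: {
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }[];
  reasoning?: string;
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
    thoughtsTokens?: number;
  };
  durationMs: number;
}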
For Google's Gemini models, you can handle multi-round tool calls by including function call and response messages in the conversation:
// First round - model makes a function call
const firstResponse = await sendPrompt(
{
messages: [{ role: "user", content: "What is 15 plus 32?" }],
tools: [calculatorTool],
toolCallMode: "AUTO",
},
{
model: ModelEnum["gemini-2.5-pro-exp-03-25"],
provider: AI_PROVIDERS.GOOGLE,
apiKey: "your-google-api-key",
}
);
// Handle the function call and get the result
if (firstResponse.tool_calls) {
const toolCall = firstResponse.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);
const result = calculate(args.operation, args.a, args.b); // Your calculation function (a sketch appears after the steps below)
// Second round - include function call and response in messages
const secondResponse = await sendPrompt(
{
messages: [
{ role: "user", content: "What is 15 plus 32?" },
{
role: "google_function_call",
id: toolCall.id,
name: toolCall.function.name,
args: args,
},
{
role: "google_function_response",
id: toolCall.id,
name: toolCall.function.name,
response: { result },
},
],
tools: [calculatorTool],
toolCallMode: "AUTO",
},
{
model: ModelEnum["gemini-2.5-pro-exp-03-25"],
provider: AI_PROVIDERS.GOOGLE,
apiKey: "your-google-api-key",
}
);
// The model will now respond with the final answer
console.log("Final response:", secondResponse.message.content);
}
The multi-round tool calling process involves:
- First round: Model makes a function call
- Your code executes the function and gets the result
- Second round: Include both the function call and its response in the messages
- Model provides the final response using the function result
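The multi-round example above assumes a `calculate` helper that executes the requested operation locally. A minimal sketch:
// Minimal sketch of the `calculate` helper referenced in the example above.
function calculate(operation: string, a: number, b: number): number {
  switch (operation) {
    case "add":
      return a + b;
    case "subtract":
      return a - b;
    case "multiply":
      return a * b;
    case "divide":
      return a / b;
    default:
      throw new Error(`Unsupported operation: ${operation}`);
  }
}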
- Support for DeepSeek
- Support for image input
- Support for streaming
- Better error handling