
Patch for using LCEL to stream from LLM #5873

Closed
@welljsjs

Description


Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

// Import paths assume the LangChain.js v0.2 package layout used in this report.
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { HuggingFaceInference } from "@langchain/community/llms/hf";
import { HumanMessagePromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
import {
  RunnableLambda,
  RunnableMap,
  RunnablePassthrough,
} from "@langchain/core/runnables";
import type { Document } from "@langchain/core/documents";

// Load the source document and split it into overlapping chunks.
const loader = new TextLoader("data/example2.txt");
const docs = await loader.load();
const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 400,
  chunkOverlap: 200,
  separators: ["\n\n", "\n", "."],
  keepSeparator: false,
});

// Deterministic ID per split so re-ingesting the same content stays idempotent.
const simpleHash = (str: string) => {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    const char = str.charCodeAt(i);
    hash = (hash << 5) - hash + char;
  }
  // Convert to a 32-bit unsigned integer in base 36 and pad with "0" to ensure length 7.
  return (hash >>> 0).toString(36).padStart(7, "0");
};

const splits = await textSplitter.splitDocuments(docs);

const embeddings = new HuggingFaceTransformersEmbeddings({
  model: "Xenova/all-MiniLM-L6-v2",
});

// `config` is the application's own configuration object.
const vectorStore = new Chroma(embeddings, {
  collectionName: config.chroma.collectionName,
  url: `http://${config.chroma.host}:${config.chroma.port}`,
  collectionMetadata: {
    "hnsw:space": "cosine",
  },
});
await vectorStore.ensureCollection();

splits.forEach(
  (split) => (split.metadata.id = simpleHash(split.pageContent))
);
await vectorStore.addDocuments(splits, {
  ids: splits.map((split) => split.metadata.id),
});

// Retrieve and generate using the relevant snippets of the loaded document.
const retriever = vectorStore.asRetriever({ k: 5, searchType: "similarity" });
const prompt = HumanMessagePromptTemplate.fromTemplate(
  `You're a virtual assistant to help with technical queries. Answer the following question referring to the given context below.


  Context: {context}


  Question: {question}`
);

const model = new HuggingFaceInference({
  model: "mistralai/Mistral-7B-Instruct-v0.3",
  apiKey: config.huggingface.apiToken,
  maxRetries: 1,
  maxTokens: config.huggingface.llmMaxTokens,
  verbose: true,
});

const outputParser = new StringOutputParser();

// Concatenate the retrieved documents into a single context string.
const mergeDocsToString = (docs: Document<Record<string, any>>[]) =>
  docs.map((doc) => doc.pageContent).join("\n\n");

const setupAndRetrieval = RunnableMap.from({
  context: new RunnableLambda({
    func: (input: string) => retriever.invoke(input).then(mergeDocsToString),
  }).withConfig({ runName: "contextRetriever" }),
  question: new RunnablePassthrough(),
});

// Construct our RAG chain and stream the answer.
const chain = setupAndRetrieval.pipe(prompt).pipe(model).pipe(outputParser);
const stream = await chain.stream(
  "Hello, why is too much sugar bad for the human body?"
);
for await (const chunk of stream) {
  console.log(chunk);
}

Error Message and Stack Trace (if applicable)

No response

Description

I'm using LangChain.js to set up a simple RAG pipeline with an LLM. I'm calling "stream" on the chain rather than "invoke" in order to stream the LLM's output.

Expected behaviour is that the final outputs of calling invoke and stream on the chain are the same, given that the input is the same.
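
For concreteness, a minimal sketch of that comparison (assuming the chain built in the example code above; the streamed chunks are concatenated before comparing):

const question = "Hello, why is too much sugar bad for the human body?";

// Single-shot invocation returns the full answer at once.
const invoked = await chain.invoke(question);

// Streaming should yield the same answer, just chunk by chunk.
let streamed = "";
for await (const chunk of await chain.stream(question)) {
  streamed += chunk;
}

// Expected: both strings are identical. With the bug described below,
// `streamed` is unrelated to the question.
console.log(invoked === streamed);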

Instead, what actually happens is that the two outputs differ. The output produced when calling stream on the chain does not make any sense and is unrelated to the input. This is caused by a bug in llms.js, which I've managed to fix; see the patch below. Within the implementation of the "async *_streamIterator" method of the BaseLLM class (line 65 of the generated llms.js), the prompt value should be passed as the first argument to "this._streamResponseChunks". However, instead of passing the prompt value as a string (which would be "prompt.toString()"), "input.toString()" is passed. Since "input" is a plain object, "input.toString()" always evaluates to the string "[object Object]" rather than the content of the prompt.

Note that my patch applies to the generated JS, not the TS source.
For the TS source, this line: https://github.com/langchain-ai/langchainjs/blob/b311ec5c19cd4ab7aad116e81fb1ea33c5d71a8d/langchain-core/src/language_models/llms.ts#L159C11-L159C25 should be changed from input.toString() to prompt.toString().
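
To illustrate why this matters, a minimal sketch of the difference (FakePromptValue is a hypothetical stand-in, not the actual LangChain type):

// Calling toString() on a plain object uses Object.prototype.toString,
// which ignores the object's contents entirely.
const input = { question: "Why is too much sugar bad for the human body?" };
console.log(input.toString()); // "[object Object]"

// A prompt-value-like class that overrides toString() returns the rendered
// prompt text, which is what _streamResponseChunks actually needs.
class FakePromptValue {
  constructor(private readonly value: string) {}
  toString(): string {
    return this.value;
  }
}

const promptValue = new FakePromptValue(
  "Context: ...\n\nQuestion: Why is too much sugar bad for the human body?"
);
console.log(promptValue.toString()); // the rendered prompt text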

diff --git a/node_modules/@langchain/core/dist/language_models/llms.js b/node_modules/@langchain/core/dist/language_models/llms.js
index 70466ae..06e0349 100644
--- a/node_modules/@langchain/core/dist/language_models/llms.js
+++ b/node_modules/@langchain/core/dist/language_models/llms.js
@@ -62,7 +62,7 @@ export class BaseLLM extends BaseLanguageModel {
                 text: "",
             });
             try {
-                for await (const chunk of this._streamResponseChunks(input.toString(), callOptions, runManagers?.[0])) {
+                for await (const chunk of this._streamResponseChunks(prompt.toString(), callOptions, runManagers?.[0])) {
                     if (!generation) {
                         generation = chunk;
                     }

System Info

ProductName: macOS
ProductVersion: 12.7.5
BuildVersion: 21H1222
NodeVersion: v22.3.0
NPMVersion: 10.8.1
LangChainVersion: 0.2.6

Metadata

Assignees

No one assigned

Labels

auto:bug (Related to a bug, vulnerability, unexpected error with an existing feature)
