Streaming a response with LangChain in JavaScript


I am writing a little application in JavaScript using the LangChain library. I have the following snippet:

/* LangChain Imports */
import { OpenAI } from "langchain/llms/openai";
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";

// ======================================================================================== //
// ============= Use LangChain to send request to the OpenAI API ========================= //
// ======================================================================================== //

const openAILLMOptions = {
  modelName: chatModel.value,
  openAIApiKey: decryptedString,
  temperature: parseFloat(temperatureValue.value),
  topP: parseFloat(topP.value),
  maxTokens: parseInt(maxTokens.value),
  stop: stopSequences.value.length > 0 ? stopSequences.value : null,
  streaming: true,
};

const model = new OpenAI(openAILLMOptions);
const memory = new BufferMemory();
const chain = new ConversationChain({ llm: model, memory: memory });

try {
  const response = await chain.call(
    { input: content.value, signal: signal },
    undefined,
    [
      {
        handleLLMNewToken(token) {
          process.stdout.write(token);
        },
      },
    ]
  );

  // handle the response
} catch (error) {
  // handle errors
}

This does not work (I tried it both with the token parameter typed in TypeScript and left untyped). I have scoured various forums, but they either implement streaming in Python or their solutions are not relevant to this problem. To summarize: I can successfully pull the response from OpenAI via the LangChain ConversationChain() call, but I can't stream the response. Is there a solution?

2 Answers

Accepted answer:

For reference, here is how I got streaming working:

// Import paths for the pre-0.1 langchain package layout used in the question
import { ChatOpenAI } from "langchain/chat_models/openai";
import { ChatPromptTemplate, MessagesPlaceholder } from "langchain/prompts";
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";

const openAILLMOptions = {
  modelName: chatModel.value,
  cache: true,
  openAIApiKey: openAIDecryptedString,
  temperature: parseFloat(temperatureValue.value),
  topP: parseFloat(topP.value),
  maxTokens: parseInt(maxTokens.value),
  stop: stopSequences.value.length > 0 ? stopSequences.value : null,
  streaming: true,
  verbose: true,
};

const chat = new ChatOpenAI(openAILLMOptions);

const chatPrompt = ChatPromptTemplate.fromMessages([
  ["system", systemPrompt.value],
  new MessagesPlaceholder("history"),
  ["human", content.value],
]);

const chain = new ConversationChain({
  memory: new BufferMemory({ returnMessages: true, memoryKey: "history" }),
  prompt: chatPrompt,
  llm: chat,
});

await chain.call({
  input: content.value,
  signal: signal,
  callbacks: [
    {
      handleLLMNewToken(token) {
        aiResponse.value += token;
      },
    },
  ],
});
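The key differences from the snippet in the question are that the callbacks array is passed inside the single options object given to chain.call() (rather than as a separate argument), and that the streaming ChatOpenAI chat model is used together with a ChatPromptTemplate instead of the completion-style OpenAI class.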
Another answer:

You can pack it all into an async generator function like this:

import { EventEmitter, on } from "events";

async function* chat(input, sessionId) {
  let id = 0;
  const { chain } = getChain(sessionId); // get chain from cache
  const emitter = new EventEmitter();

  // Not awaited: tokens are forwarded to the emitter as they arrive
  chain.call({
    input,
    callbacks: [
      {
        handleLLMNewToken(data) {
          emitter.emit("data", { data, id: id++, event: "stream" });
        },
        handleLLMEnd() {
          emitter.emit("data", { event: "end" });
        },
      },
    ],
  });

  // Re-emit the events as an async iterable until the chain signals completion
  for await (const [data] of on(emitter, "data")) {
    if (data.event === "end") return;
    yield data;
  }
}
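
A minimal sketch of consuming this generator could look like the following (the prompt text and session id are hypothetical placeholders):

async function run() {
  // Hypothetical usage: print each streamed token as it arrives
  for await (const chunk of chat("Hello, how are you?", "session-1")) {
    process.stdout.write(chunk.data); // each chunk is { data, id, event: "stream" }
  }
}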