How to return source documents when using LangChain Expression Language (LCEL)?


Most samples of using LangChain's Expression Language (LCEL) look like this:

chain = setup_and_retrieval | prompt | model | output_parser

How can I access the source_documents in a RAG application when using this expression language?

Accepted answer

This works well for me:

from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    # 1. Condense the question, retrieve documents, keep them as "source_documents".
    RunnablePassthrough.assign(source_documents=condense_question | retriever)
    # 2. Collapse the retrieved documents into a single "context" string.
    | RunnablePassthrough.assign(context=lambda inputs: format_docs(inputs["source_documents"]) if inputs["source_documents"] else "")
    # 3. Render the chat prompt from the accumulated dict.
    | RunnablePassthrough.assign(prompt=qa_prompt)
    # 4. Invoke the LLM with the rendered messages, keep the reply as "response".
    | RunnablePassthrough.assign(response=lambda inputs: llm.invoke(inputs["prompt"].messages))
)

It's called like this:

response_dict = rag_chain.invoke({"question": question, "chat_history": chat_history})
ai_msg = response_dict["response"]
source_documents = response_dict["source_documents"]
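
The returned dictionary is just the original input plus every key assigned along the way. A rough sketch of its shape, with hypothetical values:

from langchain_core.documents import Document
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompt_values import ChatPromptValue

# Hypothetical shape of response_dict (values depend on your data and model):
example_response_dict = {
    "question": "What is LCEL?",                         # passed through from the input
    "chat_history": [],                                  # passed through from the input
    "source_documents": [Document(page_content="...")],  # added by the first assign
    "context": "...",                                    # added by the second assign
    "prompt": ChatPromptValue(messages=[HumanMessage(content="...")]),  # third assign
    "response": AIMessage(content="..."),                # added by the fourth assign
}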

The mental model that helped me understand how to do it is this:

  1. You initially pass a dictionary into the chain (in my case with the keys question and chat_history).
  2. Every time you use RunnablePassthrough.assign, you can ADD stuff to that dictionary and then pass that on to the next step.
  3. RunnablePassthrough.assign always RETURNS a dictionary.
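
A minimal, standalone example of that dict-in, dict-out behavior (the names here are made up for illustration):

from langchain_core.runnables import RunnablePassthrough

# assign merges the new keys into the incoming dict and passes the result on.
demo = RunnablePassthrough.assign(doubled=lambda d: d["n"] * 2)
print(demo.invoke({"n": 21}))  # -> {'n': 21, 'doubled': 42}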

This is what happens in my code example:

  1. We use RunnablePassthrough.assign to add a new source_documents key to the dictionary. Its value comes from condense_question (defined elsewhere), a chain that condenses the question and chat history into a standalone question, whose output is then piped into our retriever (also defined elsewhere). Illustrative sketches of these helpers follow this list.
  2. We use RunnablePassthrough.assign to add a new context key to the dictionary. Its value is the result of calling a format_docs method (defined elsewhere) that combines the source_documents into a single context string.
  3. We use RunnablePassthrough.assign to add a new prompt key to the dictionary. Its value is the result of calling qa_prompt, which is defined as qa_prompt = ChatPromptTemplate.from_messages(...).
  4. We use RunnablePassthrough.assign one more time to add a new response key to the dictionary. Its value is the result of actually invoking the llm with the messages from our prompt.
  5. The chain returns a dictionary with all the keys we've added along the way. The response key contains the LLM's response as an AIMessage, and the source_documents key contains the source documents.
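
To make that concrete, here is one way the helpers referenced above might look. This is only a sketch under assumptions: condense_question, format_docs, qa_prompt, and llm are all defined elsewhere in my project, so treat these definitions as illustrative rather than as my actual code. The ChatOpenAI model is just a stand-in for whatever chat model you use, and retriever would be whatever you already have (for example, a vector store's as_retriever()).

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()  # assumption: any chat model works here

# condense_question: a chain that rewrites the question plus chat history into
# a standalone question; its string output is piped into the retriever.
condense_prompt = ChatPromptTemplate.from_messages([
    ("system", "Rewrite the follow-up question as a standalone question."),
    ("human", "Chat history: {chat_history}\n\nFollow-up question: {question}"),
])
condense_question = condense_prompt | llm | StrOutputParser()

# format_docs: joins the retrieved documents into a single context string.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# qa_prompt: reads the keys it needs from the accumulated dict; extra keys
# are simply ignored by the prompt template.
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only this context:\n\n{context}"),
    ("human", "{question}"),
])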

I'm sure this can be done in a more concise way, but this worked for me and I can understand it :)