How to detect whether ConversationalRetrievalChain called the OpenAI LLM?


I have the following code:

from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS

chat_history = []
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(chunks, embeddings)
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0.1), db.as_retriever())
result = qa({"question": "What is stack overflow", "chat_history": chat_history})

The code creates embeddings, builds an in-memory FAISS vector store from the text I have in the chunks array, creates a ConversationalRetrievalChain, and then asks a question.

Based on what I understand from ConversationalRetrievalChain, when asked a question, it will first query the FAISS vector db, then, if it can't find anything matching, it will go to OpenAI to answer that question. (is my understanding correct?)

How can I detect if it actually called OpenAI to get the answer or it was able to get it from the in-memory vector DB? The result object contains question, chat_history and answer properties and nothing else.


4 Answers

BEST ANSWER (score: 3)

"Based on what I understand from ConversationalRetrievalChain, when asked a question, it will first query the FAISS vector db, then, if it can't find anything matching, it will go to OpenAI to answer that question."

This part is not correct. Each time ConversationalRetrievalChain receives a query in the conversation, it rephrases the question, retrieves documents from your vector store (FAISS in your case), and returns an answer generated by the LLM (OpenAI in your case). In other words, ConversationalRetrievalChain is the conversational version of RetrievalQA, and the LLM is called for every question.
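If you want to see which chunks were actually retrieved and fed to the LLM for a given question, you can ask the chain to return them. A minimal sketch, reusing the db and chat_history from the question; return_source_documents is an option on the chain:

from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI

qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0.1),
    db.as_retriever(),
    return_source_documents=True,  # include the retrieved chunks in the result
)

result = qa({"question": "What is stack overflow", "chat_history": chat_history})
# the answer text is still generated by the OpenAI LLM; the retrieved chunks
# are only the context that was stuffed into its prompt
print(result["answer"])
print(result["source_documents"])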

ANSWER (score: 0)

I personally don't think ConversationalRetrievalChain can get you any answer from the documents without sending an API request to OpenAI in the provided example. But I'm not an expert on it, so I could be wrong.

But you could use another, cheaper (or local) LLM to condense the follow-up question into a standalone question, which helps optimize the token count.

Here is their example:

from langchain.chat_models import ChatOpenAI

qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0, model="gpt-4"),
    vectorstore.as_retriever(),
    # cheaper model used only to rewrite the follow-up question
    condense_question_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
)

One way to trace API usage is with the OpenAI callback:

from langchain.callbacks import get_openai_callback

# every OpenAI call made inside this block is counted, so you can wrap
# the qa({...}) call from the question in exactly the same way
with get_openai_callback() as cb:
    result = llm("Tell me a joke")
    print(cb)

Tokens Used: 42
    Prompt Tokens: 4
    Completion Tokens: 38
Successful Requests: 1
Total Cost (USD): $0.00084
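Applied to the chain from the question, the same callback tells you directly whether OpenAI was called. A sketch, assuming the qa and chat_history objects defined there:

from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    result = qa({"question": "What is stack overflow", "chat_history": chat_history})

# if anything reached OpenAI, successful_requests will be greater than zero
print(cb.successful_requests, cb.total_tokens, cb.total_cost)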

Another useful way is to use an additional tool to trace requests: https://github.com/amosjyng/langchain-visualizer

ANSWER (score: 1)

You can detect if the answer was obtained from the in-memory vector database by checking if the "answer" property exists and is not empty in the result object. If it's present, the answer came from the database; otherwise, it was generated by the OpenAI model.

ANSWER (score: 0)

Hi, you can apply for access to https://smith.langchain.com/ to visually trace the ConversationalRetrievalChain.

(Screenshot: LangSmith trace of the chain's runs.)
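Tracing is usually switched on through environment variables before the chain is built; a minimal sketch (the project name here is just an example):

import os

# send every chain and LLM run to LangSmith so each call can be inspected
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your LangSmith API key>"
os.environ["LANGCHAIN_PROJECT"] = "conversational-retrieval-demo"  # example name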

Here I'm using AzureChatOpenAI. The first call to the LLMChain is for "Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language."

The second call is for your specific prompt, or the LangChain default prompt.

In addition, you can set verbose=True on ConversationalRetrievalChain.from_llm to see what is happening, as in the sketch below.
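A minimal sketch of that, reusing the db retriever from the question:

from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI

# verbose=True prints the prompts sent to the LLM, including both the
# condense-question step and the final question-answering step
qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0.1),
    db.as_retriever(),
    verbose=True,
)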

Hope it helps. Regards.