I am having trouble finding the correct way to evaluate a response from OpenAI and LlamaIndex. I am using Streamlit and LlamaIndex to build a gpt-3.5 RAG application over blog posts. I now want to determine whether a blog post was used to generate a response and, if so, which one. I am currently using `RelevancyEvaluator` for this: by calling `evaluator.evaluate()` I hope to get back whether an article was used (and, later, which article).
However, this does not work as intended. The first message I send to ChatGPT works, and I am told whether a document was used. The second message I send causes the system to time out: I still get the response from ChatGPT, but the `evaluator.evaluate()` call times out.
I have tried:
- Using `index.as_chat_engine()` instead of `index.as_query_engine()`, but the same behaviour occurs.
- Prompt engineering, but this hallucinates some answers.
- Checking that I am not hitting any OpenAI rate limits (I am not on the basic version, where you only get 3 calls a minute).
I have attached a slightly redacted and reduced version of the code below. It follows the tutorials that LlamaIndex provides very closely:
    @st.cache_resource(show_spinner=False)
    def load_data():
        with st.spinner(text="Loading and indexing knowledge – hang tight! This should take 1-2 minutes."):
            reader = SimpleDirectoryReader(input_dir="./data", recursive=True)
            docs = reader.load_data()
            service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo", temperature=0.5, system_prompt="...."))
            index = VectorStoreIndex.from_documents(docs, service_context=service_context)
            return index, service_context

    index, service_context = load_data()
    chat_engine = index.as_query_engine()

    if prompt := st.chat_input("Your question"):  # Prompt for user input and save to chat history
        st.session_state.messages.append({"role": "user", "content": prompt})

    for message in st.session_state.messages:  # Display the prior chat messages
        with st.chat_message(message["role"]):
            st.write(message["content"])

    if st.session_state.messages[-1]["role"] != "assistant":
        with st.chat_message("assistant", avatar=assistant_img):
            with st.spinner("Thinking..."):
                evaluator = RelevancyEvaluator(service_context=service_context)
                response = chat_engine.query(prompt)
                st.write(response.response)
                response_str = response.response
                for source_node in response.source_nodes:
                    eval_result = evaluator.evaluate(
                        query=prompt, response=response_str, contexts=[source_node.get_content()]
                    )
                    print("RESULT")
                    print(str(eval_result.passing))
                    print(eval_result.feedback)
                message = {"role": "assistant", "content": response.response}
                st.session_state.messages.append(message)  # Add response to message history
If anyone could explain why this behaviour occurs, or how I can fix it, I would be very grateful!
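For the "which article" part of the goal, here is a rough sketch of how the per-node evaluation results could be turned into a list of articles. It is not tied to LlamaIndex itself: `FakeSourceNode` and `summarize_sources` are hypothetical names made up for illustration, and the `file_name` metadata key is an assumption about how the documents were loaded, so check what your own source nodes actually carry before relying on it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class FakeSourceNode:
    """Hypothetical stand-in for one entry of response.source_nodes."""
    metadata: Dict[str, Any] = field(default_factory=dict)
    score: Optional[float] = None


def summarize_sources(source_nodes, passing_flags) -> List[Tuple[str, Optional[float]]]:
    """Return (file_name, score) for every node whose evaluation passed.

    passing_flags is expected to hold one boolean per node, for example
    the eval_result.passing values collected in the loop from the question.
    """
    used = []
    for node, passed in zip(source_nodes, passing_flags):
        if passed:
            # "file_name" is an assumed metadata key; fall back if it is absent.
            name = node.metadata.get("file_name", "<unknown>")
            used.append((name, node.score))
    return used


# Example with made-up nodes and made-up evaluation outcomes.
nodes = [
    FakeSourceNode({"file_name": "post_a.md"}, 0.82),
    FakeSourceNode({"file_name": "post_b.md"}, 0.41),
]
used = summarize_sources(nodes, [True, False])
print(used)
```

In the Streamlit app the flags would come from the evaluation loop rather than a hard-coded list, so this only helps once `evaluator.evaluate()` returns reliably.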
I encountered the same issue as you did, but I managed to solve it by adding the following code at the top of my program.