How to integrate a LlamaIndex chat engine into a web application?


I am building a RAG-based web application with FastAPI. The RAG itself is done with LlamaIndex, using GPT-4 as the LLM. LlamaIndex chat engines can persist conversation history, but in a web application deployed in a container, chat messages come from many different users. How can I persist the chat history for each user separately?
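LlamaIndex's chat memory can be backed by a chat store that partitions histories by a per-user key (the `chat_store_key` passed to `ChatMemoryBuffer.from_defaults`), which is the library-native way to keep users separate. Below is a minimal self-contained sketch of that keying pattern; `InMemoryChatStore` is a hypothetical stand-in for a real chat store such as LlamaIndex's `SimpleChatStore`, not the library's actual class:

```python
from collections import defaultdict


class InMemoryChatStore:
    """Hypothetical stand-in for a chat store (e.g. LlamaIndex's
    SimpleChatStore): each user's history lives under its own key."""

    def __init__(self) -> None:
        # chat_store_key (one per user) -> ordered list of messages
        self._store: dict[str, list[dict]] = defaultdict(list)

    def add_message(self, key: str, message: dict) -> None:
        self._store[key].append(message)

    def get_messages(self, key: str) -> list[dict]:
        return list(self._store[key])


store = InMemoryChatStore()

# Messages are written under each user's own key, so histories never mix.
store.add_message("user-alice", {"role": "user", "content": "Hi"})
store.add_message("user-bob", {"role": "user", "content": "Hello"})

print(len(store.get_messages("user-alice")))  # 1
print(len(store.get_messages("user-bob")))    # 1
```

In the real API the same idea looks like `ChatMemoryBuffer.from_defaults(chat_store=chat_store, chat_store_key=user_id)`, with `user_id` taken from the authenticated request.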

The application will be deployed in a Kubernetes cluster, so there is no guarantee that the same pod serves every message from a given user. How do I preserve the message history in that case? Does LlamaIndex offer any support for this, or should I write middleware that stores each user's history and sends it along with the current message?
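Since any pod may receive the next request, the history has to live outside the pods in a shared store (LlamaIndex ships a `RedisChatStore` for exactly this case). The sketch below shows the per-request flow under stated assumptions: a plain dict stands in for the external store, and a hypothetical `answer()` function replaces the real `chat_engine.chat()` call:

```python
# A plain dict stands in for an external store such as Redis; in a real
# deployment every pod would connect to the same Redis instance instead.
EXTERNAL_STORE: dict[str, list[dict]] = {}


def answer(history: list[dict], message: str) -> str:
    # Hypothetical placeholder for chat_engine.chat(); echoes for the demo.
    return f"echo: {message} (context: {len(history)} prior messages)"


def handle_request(user_id: str, message: str) -> str:
    """What each pod does per request: load this user's history from the
    shared store, run the engine, then persist the updated history back."""
    history = EXTERNAL_STORE.get(user_id, [])
    reply = answer(history, message)
    history = history + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": reply},
    ]
    # Any other pod can now pick up where this one left off.
    EXTERNAL_STORE[user_id] = history
    return reply


# Two requests from the same user, as if served by different pods:
handle_request("user-42", "first question")
print(handle_request("user-42", "follow-up"))
# -> echo: follow-up (context: 2 prior messages)
```

Because the pod holds no state between requests, no session affinity (sticky sessions) is needed; the only requirement is that every pod can reach the same store.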

