Langchain - Can't solve the dynamic filtering problem from vectorstore


I am using LangChain version 0.218 and was wondering whether anyone has been able to filter a seeded vectorstore dynamically at runtime, for example when it is run by an Agent.

My goal is to put this dynamic filter in a Conversational Retrieval QA chain, where I filter a retriever by a filename extracted from the conversation inputs and retrieve all of that file's chunks (with k in search_kwargs set to the number of chunks belonging to the file, looked up in a mapper file).

I am able to filter a seeded vectorstore (like Chroma) manually, like this:

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.vectorstores import Chroma

# init a vectorstore and seed documents
vectorstore = Chroma.from_documents(..)

# 'somehow' I get hold of the filename from user input or chat history
found_filename = "report.pdf"

# maps each filename to its chunks, so k can be set to that file's chunk count
file_chunk_mapper = {"report.pdf": ["chunk1", "chunk2", ...]}

# filter on the 'filename' metadata field present on every chunk
# (LangChain's Chroma wrapper reads the metadata filter from the "filter" key)
one_doc_retriever = vectorstore.as_retriever(
    search_kwargs={
        "filter": {"filename": found_filename},
        "k": len(file_chunk_mapper[found_filename]),
    }
)

# conversation memory handed to the chain below
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# QA Chain which will be used as a Tool by Agents
QA_chain = ConversationalRetrievalChain(.., retriever=one_doc_retriever, memory=memory)

# this would be run by an Agent
QA_chain.run("all person names in file report")

## ANSWER
## I found all the names like: ...

I have tried using no filters, as well as other methods such as Self-Query Retrieval and Compression Query Retrieval, but none worked as well as this approach, where the model has a specific, fixed set of chunks to look at.

As far as I can tell from the documentation, the only option seems to be creating a custom chain made of two chains: the first extracts the filename and filters a retriever, and the second is then executed with that new retriever.
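To make the two steps concrete, here is a rough sketch of what I have in mind, written in plain Python so it runs without LangChain. The helper names `extract_filename` and `search_kwargs_for` are hypothetical (they are not LangChain APIs), the substring matching is deliberately naive, and I am assuming the retriever takes its metadata filter under a `filter` key in `search_kwargs`.

```python
from typing import Dict, List, Optional


def extract_filename(text: str, file_chunk_mapper: Dict[str, List[str]]) -> Optional[str]:
    """Step 1 (hypothetical helper): find a known file mentioned in the text.

    Matches the full filename or its stem, case-insensitively. This is naive
    substring matching; a real version might ask an LLM to do the extraction.
    """
    lowered = text.lower()
    for filename in file_chunk_mapper:
        stem = filename.rsplit(".", 1)[0]
        if filename.lower() in lowered or stem.lower() in lowered:
            return filename
    return None


def search_kwargs_for(filename: str, file_chunk_mapper: Dict[str, List[str]]) -> dict:
    """Step 2 (hypothetical helper): per-file filter plus k = number of chunks."""
    return {
        "filter": {"filename": filename},  # assumed key for the metadata filter
        "k": len(file_chunk_mapper[filename]),
    }


file_chunk_mapper = {
    "report.pdf": ["chunk1", "chunk2", "chunk3"],
    "notes.txt": ["chunk1"],
}
found = extract_filename("all person names in file report", file_chunk_mapper)
kwargs = search_kwargs_for(found, file_chunk_mapper)
```

The resulting `kwargs` would then go to `vectorstore.as_retriever(search_kwargs=kwargs)`, and that retriever would be handed to the second chain.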

Am I missing something here? Is there a simpler or smarter way to go about this?

But how do I use it in an Agent execution, where chains are run automatically? It has been boggling my mind for the past two days.
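One direction I am considering is to wrap everything in a single string-in, string-out function that rebuilds the filtered retriever on every call, since an Agent tool receives only the query text. The sketch below is an assumption-laden illustration, not a confirmed solution: `make_file_qa_tool` and `chain_factory` are hypothetical names, `FakeVectorStore` and `FakeChain` are stand-ins so the code runs without LangChain, and the `filter` key in `search_kwargs` is assumed.

```python
from typing import Callable, Dict, List


def make_file_qa_tool(vectorstore, file_chunk_mapper: Dict[str, List[str]],
                      chain_factory: Callable) -> Callable[[str], str]:
    """Return a function that filters the retriever per call, then runs a QA chain.

    `chain_factory` (hypothetical) takes a retriever and returns an object
    with a `.run(query)` method, e.g. a freshly built QA chain.
    """
    def file_qa(query: str) -> str:
        lowered = query.lower()
        matches = [name for name in file_chunk_mapper if name.lower() in lowered]
        if not matches:
            return "No known filename found in the query."
        filename = matches[0]
        retriever = vectorstore.as_retriever(
            search_kwargs={
                "filter": {"filename": filename},  # assumed filter key
                "k": len(file_chunk_mapper[filename]),
            }
        )
        return chain_factory(retriever).run(query)

    return file_qa


class FakeVectorStore:
    """Stand-in for a real vectorstore; just echoes the kwargs it was given."""
    def as_retriever(self, search_kwargs):
        return search_kwargs


class FakeChain:
    """Stand-in for a QA chain; reports which retriever it was built with."""
    def __init__(self, retriever):
        self.retriever = retriever

    def run(self, query):
        return f"answered {query!r} with {self.retriever}"


tool_func = make_file_qa_tool(FakeVectorStore(), {"report.pdf": ["chunk1", "chunk2"]}, FakeChain)
result = tool_func("all person names in report.pdf")
```

If this is sound, `tool_func` could presumably be passed as the `func` of a `langchain.agents.Tool`. Because the chain is rebuilt on each call, the factory would need to reuse one shared memory object to keep the conversation history.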
