This is the error I get in my LLM application:
An error occurred: Expected EmbeddingFunction.__call__ to have the following signature: odict_keys(['self', 'input']), got odict_keys(['self', 'args', 'kwargs']) Please see https://docs.trychroma.com/embeddings for details of the EmbeddingFunction interface. Please note the recent change to the EmbeddingFunction interface: https://docs.trychroma.com/migration#migration-to-0416---november-7-2023
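As far as I can tell from the message, the check compares the parameter names of the embedding function's `__call__` method against `self, input`. Below is only a sketch of that kind of comparison using the standard library's `inspect` module. It is not Chroma's actual code, and the names `NewStyleEmbedding`, `OldStyleWrapper`, `call_parameter_names` and `matches_expected` are made up for illustration:

```python
import inspect


class NewStyleEmbedding:
    """Hypothetical embedding function with the signature the message expects."""

    def __call__(self, input):
        # Dummy embedding: one number per input text.
        return [[float(len(text))] for text in input]


class OldStyleWrapper:
    """Hypothetical wrapper whose __call__ hides its parameters behind *args/**kwargs."""

    def __call__(self, *args, **kwargs):
        return [[0.0] for _ in args[0]]


def call_parameter_names(obj_type):
    """Return the parameter names of obj_type.__call__, including 'self'."""
    return list(inspect.signature(obj_type.__call__).parameters.keys())


def matches_expected(obj_type):
    """True if __call__ takes exactly (self, input)."""
    return call_parameter_names(obj_type) == ["self", "input"]


print(call_parameter_names(NewStyleEmbedding))
print(call_parameter_names(OldStyleWrapper))
print(matches_expected(NewStyleEmbedding), matches_expected(OldStyleWrapper))
```

The second class gives the same `['self', 'args', 'kwargs']` list that appears in my error, which makes me think something is wrapping my embeddings in a generic `*args, **kwargs` callable before Chroma sees them.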
This is the main part of the code:
try:
    start = timeit.default_timer()
    config = {
        'max_new_tokens': 1024,
        'repetition_penalty': 1.1,
        'temperature': 0.1,
        'top_k': 50,
        'top_p': 0.9,
        'stream': True,
        'threads': int(os.cpu_count() / 2)
    }
    llm = CTransformers(
        model="TheBloke/zephyr-7B-beta-GGUF",
        model_file="zephyr-7b-beta.Q4_0.gguf",
        model_type="mistral",
        lib="avx2",  # for CPU use
        **config
    )
    st.write("LLM Initialized:")
    model_name = "BAAI/bge-large-en"
    model_kwargs = {'device': 'cpu'}
    encode_kwargs = {'normalize_embeddings': False}
    embeddings = HuggingFaceBgeEmbeddings(
        model_name=model_name,
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs
    )
    # embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl",
    #                                            model_kwargs={"device": "cpu"})
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=30, length_function=len)
    chunked_documents = text_splitter.split_documents(loaded_documents)
    persist_directory = 'db'
    # Create and persist a Chroma vector database from the chunked documents
    db = Chroma.from_documents(documents=chunked_documents, embedding=embeddings, persist_directory=persist_directory)
    db.persist()
    retriever = db.as_retriever(search_kwargs={"k": 1})
    qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True, verbose=True)
    bot_response = qa(query)
    lines = bot_response['result'].split('\n')
    wrapped_lines = [textwrap.fill(line, width=width) for line in lines]
    wrapped_text = '\n'.join(wrapped_lines)
    for source in bot_response["source_documents"]:
        sources = source.metadata['source']
    end = timeit.default_timer()
    st.write("Elapsed time:")
    st.write(end - start)
    st.write("Bot Response:")
    st.write(wrapped_text)
    st.write(sources)
except Exception as e:
    st.error(f"An error occurred: {str(e)}")
I tried changing the embedding like this:
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl",
                                           model_kwargs={"device": "cpu"})
but I still get the same error. What is the problem here? The full code is at https://huggingface.co/spaces/captain-awesome/Docuverse-zephyr-beta/blob/main/app.py
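Since the error message links to a migration note for Chroma 0.4.16, the installed package versions may matter. Here is a small sketch for printing them, assuming the relevant distributions are named `chromadb`, `langchain` and `sentence-transformers` (the helper name `installed_version` is my own):

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to match your environment.
for name in ("chromadb", "langchain", "sentence-transformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

I can run this in the Space and add the output to the question if that helps.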