Issue with Qdrant collection creation | Not sure which input format also supports filtering


This is my sample input DataFrame:

import pandas as pd

data = pd.DataFrame({
    'name': ['Entry 1', 'Entry 2', 'Entry 3'],
    'urls': ['http://example.com/1', 'http://example.com/2', 'http://example.com/3'],
    'text': ['Text for Entry 1', 'Text for Entry 2', 'Text for Entry 3'],
    'type': ['Type A', 'Type B', 'Type C']})

I want to index it on Qdrant Cloud, and for that I tried the LangChain code below, following the Qdrant documentation:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Qdrant

texts = data["text"].tolist()
model_name = "sentence-transformers/sentence-t5-base"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}

embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)
# qdrant_url and qdrant_key hold my cluster URL and API key
doc_store = Qdrant.from_texts(
    texts,
    embeddings,
    url=qdrant_url,
    api_key=qdrant_key,
    collection_name="my-collection",
)

This method does not store the page metadata or the vectors in the cloud, even though I was following the official documentation: https://qdrant.tech/documentation/frameworks/langchain/

This is how it is stored in the cloud: [screenshot]

As you can see, the metadata and vectors are blank. Can someone please help me here? I have not found much support for this in LangChain.


There are 2 best solutions below

Kacper Łukawski

You are not passing the metadata to Qdrant.from_texts, but just the texts and embeddings. It should be fine if you build the metadata objects and pass them as a metadatas parameter to the Qdrant.from_texts call.
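As a rough sketch of what that could look like with the sample data from the question: the helper names build_metadatas and index_in_qdrant are made up for illustration, and a plain dict of lists stands in for the DataFrame (with pandas, data.drop(columns=["text"]).to_dict("records") should give the same list of dicts).

```python
# Sample rows from the question, kept as a plain dict of lists.
data = {
    "name": ["Entry 1", "Entry 2", "Entry 3"],
    "urls": ["http://example.com/1", "http://example.com/2", "http://example.com/3"],
    "text": ["Text for Entry 1", "Text for Entry 2", "Text for Entry 3"],
    "type": ["Type A", "Type B", "Type C"],
}


def build_metadatas(data, text_key="text"):
    """Return one metadata dict per row, leaving out the text column."""
    meta_keys = [key for key in data if key != text_key]
    n_rows = len(data[text_key])
    return [{key: data[key][i] for key in meta_keys} for i in range(n_rows)]


def index_in_qdrant(texts, metadatas, embeddings, **qdrant_kwargs):
    """Hypothetical wrapper: pass texts and their metadata to Qdrant.from_texts.

    qdrant_kwargs are forwarded unchanged, e.g. url=..., api_key=...,
    collection_name=... as in the question.
    """
    # Imported lazily so the metadata part above runs without langchain installed.
    from langchain.vectorstores import Qdrant

    return Qdrant.from_texts(texts, embeddings, metadatas=metadatas, **qdrant_kwargs)


texts = list(data["text"])
metadatas = build_metadatas(data)  # texts[i] is paired with metadatas[i]
```

With the embeddings object from the question, the call would then be index_in_qdrant(texts, metadatas, embeddings, url=qdrant_url, api_key=qdrant_key, collection_name="my-collection"). As far as I know, the LangChain wrapper stores each dict under a metadata key in the point payload, so a filter on the type column would target the key metadata.type.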

By the way, vectors are not displayed in the UI, but it doesn't mean they're not stored. Vectors are rarely used after the semantic search, so the API does not return them to avoid network overhead.
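To check that the vectors really are stored, one option is to ask for them explicitly through the qdrant-client package. This is only a sketch: fetch_points_with_vectors and connect are hypothetical helper names, and it assumes the collection was created as in the question.

```python
def fetch_points_with_vectors(client, collection_name, limit=10):
    """Return up to `limit` stored points, including their vectors.

    `client` is expected to be a qdrant_client.QdrantClient. Its scroll()
    method leaves vectors out unless with_vectors=True is passed.
    """
    points, _next_offset = client.scroll(
        collection_name=collection_name,
        limit=limit,
        with_payload=True,
        with_vectors=True,
    )
    return points


def connect(url, api_key):
    """Open a client for a Qdrant Cloud cluster (needs qdrant-client installed)."""
    # Imported lazily so the helper above can be read and tested on its own.
    from qdrant_client import QdrantClient

    return QdrantClient(url=url, api_key=api_key)
```

For example, fetch_points_with_vectors(connect(qdrant_url, qdrant_key), "my-collection") should return records whose vector attribute is filled in, next to the payload that the UI already shows.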

j3ffyang

You can embed the data from the dict directly with the Qdrant vector store:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Qdrant

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

doc_store = Qdrant.from_texts(
    list(data["text"]),  # the texts from the question's data, not the literal string "data"
    embeddings,
    # url=qdrant_url,
    # api_key=qdrant_key,
    location=":memory:",  # Run Qdrant locally, in memory
    collection_name="my-collection",
)