How to use Python to perform vector search or hybrid search on Azure AI Search?

As the title says.

My setup is as follows: I selected "Import and vectorize data" in the Azure AI Search portal and got an index with vector values. I normally query Azure AI Search from Python.

My Python code is as follows:

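from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# key, endpoint, and index_name are defined elsewhere for the search service
credential = AzureKeyCredential(key)
search_client = SearchClient(
    endpoint=endpoint,
    index_name=index_name,
    credential=credential
)

# Plain keyword (full-text) search, returning only the "title" field
text = input("Qes:")
results = search_client.search(search_text=text, select="title")

for ans in results:
    print(ans)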

How do I perform a vector search or a hybrid search in Python with this setup?

Dasari Kamali On BEST ANSWER

Posting my comments as an answer for the benefit of the community.

You can check this GitHub link; the steps below outline how to perform a vector search:

  1. Generate Embeddings: Start by reading your data and generating embeddings using OpenAI. Once generated, export these embeddings into a format suitable for insertion into your Azure AI Search index.

  2. Set Up Search Index: Create the schema for your search index and configure vector search settings according to your requirements.

  3. Add Text and Embeddings to Index: Populate your vector store with the text data and corresponding metadata from your JSON dataset.

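  4. Conduct Vector Similarity Search: Use the code below to perform a vector similarity search. The snippet embeds the text query with the OpenAI client and sends the resulting vector to the index. If your index has a vectorizer configured (which is typically the case for indexes created with "Import and vectorize data"), you can instead pass the raw query text in a VectorizableTextQuery and let the service vectorize the query for you.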
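To make step 4 more concrete, here is a minimal pure-Python sketch of what a k-nearest-neighbors similarity search computes. It is only an illustration: the toy vectors, titles, and function names are hypothetical, and the real service uses the similarity metric and (usually approximate) search algorithm configured on the index rather than this brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_vector, documents, k=3):
    """Return the k (title, score) pairs most similar to query_vector.

    `documents` is a list of (title, vector) pairs, a toy stand-in for an index.
    """
    scored = [(cosine_similarity(query_vector, vec), title) for title, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(title, score) for score, title in scored[:k]]


# Hypothetical 3-dimensional "embeddings"; real embeddings have far more dimensions
toy_index = [
    ("Azure DevOps", [0.9, 0.1, 0.0]),
    ("Azure Blob Storage", [0.1, 0.9, 0.0]),
    ("Azure Functions", [0.7, 0.2, 0.1]),
]

top = nearest_neighbors([1.0, 0.0, 0.0], toy_index, k=2)
for title, score in top:
    print(f"{title}: {score:.3f}")
```

The `k` argument plays the same role as `k_nearest_neighbors` in the query below: it caps how many of the closest documents the vector part of the query returns.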

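from azure.search.documents.models import VectorizedQuery

# Assumes `client` is an already-configured OpenAI (or Azure OpenAI) client and
# `embedding_model_name` is the embedding model that was used to build the index.
query = "tools for software development"

# Embed the query text with the same model that produced the document vectors
embedding = client.embeddings.create(input=query, model=embedding_model_name).data[0].embedding
# "contentVector" must match the name of the vector field in your index
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")

# search_text=None makes this a pure vector query (no keyword matching)
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["title", "content", "category"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")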

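For a hybrid search, pass the query text as search_text together with the vector query, so the keyword and vector searches run in the same request: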

# Reuses `client`, `embedding_model_name`, `search_client`, and the
# VectorizedQuery import from the vector search example.
query = "scalable storage solution"

# Embed the query text for the vector half of the hybrid query
embedding = client.embeddings.create(input=query, model=embedding_model_name).data[0].embedding
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")

# search_text carries the keyword half; top limits the merged result list
results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["title", "content", "category"],
    top=3
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
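A note on the score printed for hybrid results: Azure AI Search documents Reciprocal Rank Fusion (RRF) as the method for merging the keyword and vector rankings, so `@search.score` in a hybrid query is a fused rank score rather than a raw keyword or similarity score. The sketch below illustrates the idea in plain Python. The function name, document ids, and the constant `k=60` (a commonly cited value) are assumptions for illustration, not the service's exact implementation.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one fused ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the best match in that list.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the keyword and vector halves of a hybrid query
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise to the top, which is why hybrid search can surface results that neither keyword nor vector search alone would rank first.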