Azure AI Search sample index for .msg email files


Does anyone know how to create Azure AI Search indexes for email .msg files?

I have been able to find sample indexes for JSON content but can't seem to find samples that index email content.

I would like to be able to create an index based on the common email properties: From, To, CC, Subject, Sent Date, and body.

I believe it would be something like:

    {
        "name": "email-index",  
        "fields": [
            {"name": "From", "type": "Edm.String", "key": true, "filterable": true},
            {"name": "To", "type": "Collection(Edm.String)",
                "fields": [
                "address1",
                "address2"
                ]
            },
            {"name": "CC", "type": "Collection(Edm.String)",
                "fields": [
                "address1",
                "address2"
                ]
            },
            {"name": "BCC", "type": "Collection(Edm.String)",
                "fields": [
                "address1",
                "address2"
                ]
            },
            {"name": "DateSent", "type": "Edm.DateTimeOffset", "searchable": true, "filterable": false, "sortable": false, "facetable": false, "analyzer": "en.lucene"},
            {"name": "Body", "type": "Edm.String", "searchable": true, "filterable": true, "sortable": true, "facetable": true}
        ]
    }

I can't find samples that show which .msg email fields to use when constructing the index.


Answer by Sampath

The code below creates or updates an index for email data in Azure AI Search (formerly Azure Cognitive Search) using the Azure SDK for Python.

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import (
        SearchIndex,
        SearchableField,
    )

    # Azure AI Search service endpoint and admin key
    service_name = "YOUR_SEARCH_SERVICE_NAME"
    admin_key = "YOUR_SEARCH_SERVICE_ADMIN_API_KEY"
    endpoint = f"https://{service_name}.search.windows.net/"

    # Index name
    index_name = "email-index2"

    # Define the fields for the schema.
    # The SimpleField helper always produces a non-searchable field, so fields
    # that should be full-text searchable are declared with SearchableField.
    # SearchableField is Edm.String, or Collection(Edm.String) when collection=True.
    fields = [
        SearchableField(name="EmailId", key=True),
        SearchableField(name="From", filterable=True),
        SearchableField(name="To", collection=True, filterable=True),
        SearchableField(name="CC", collection=True, filterable=True),
        SearchableField(name="BCC", collection=True, filterable=True),
        SearchableField(name="DateSent", filterable=True),
        SearchableField(name="Subject", filterable=True),
        SearchableField(name="Body", filterable=True),
    ]

    # Instantiate the SearchIndex object with the defined fields
    index = SearchIndex(name=index_name, fields=fields)

    # Instantiate the SearchIndexClient
    credential = AzureKeyCredential(admin_key)
    index_client = SearchIndexClient(endpoint=endpoint, credential=credential)

    # Create or update the index
    try:
        index_client.create_or_update_index(index=index)
        print(f"Index '{index_name}' created or updated successfully.")
    except Exception as e:
        print(f"An error occurred: {e}")

The same index expressed as a JSON definition (for the field types, see "Supported data types" in the Azure AI Search documentation):

    {
        "name": "email-index",
        "fields": [
            {"name": "EmailId", "type": "Edm.String", "key": true, "searchable": true},
            {"name": "From", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "To", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "CC", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "BCC", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "DateSent", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "Subject", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "Body", "type": "Edm.String", "searchable": true, "filterable": true}
        ]
    }
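
Once the index exists, each email has to be turned into a document whose keys match the field names above. The following is only a sketch of that step, using the Python standard library. The input dict (`message_id`, `from`, `to`, `cc`, `bcc`, `date`, `subject`, `body`) is a hypothetical shape standing in for whatever your .msg parser returns, and `to_search_document` is an illustrative helper, not part of any SDK. The key is Base64 URL-safe encoded because document keys may only contain letters, digits, underscores, dashes, and equal signs.

```python
import base64
from email.utils import getaddresses, parsedate_to_datetime


def make_key(message_id: str) -> str:
    """Encode an arbitrary ID so it only uses characters allowed in a document key."""
    return base64.urlsafe_b64encode(message_id.encode("utf-8")).decode("ascii")


def split_addresses(header: str) -> list:
    """Turn a header such as 'A <a@x.com>, b@y.com' into a list of bare addresses."""
    if not header.strip():
        return []
    return [addr for _name, addr in getaddresses([header]) if addr]


def to_search_document(msg: dict) -> dict:
    """Map parsed email properties (hypothetical input shape) to the index fields."""
    return {
        "EmailId": make_key(msg["message_id"]),
        "From": msg["from"],
        "To": split_addresses(msg.get("to", "")),
        "CC": split_addresses(msg.get("cc", "")),
        "BCC": split_addresses(msg.get("bcc", "")),
        # DateSent is Edm.String in the schema above, so store an ISO 8601 string.
        "DateSent": parsedate_to_datetime(msg["date"]).isoformat(),
        "Subject": msg.get("subject", ""),
        "Body": msg.get("body", ""),
    }


sample = {
    "message_id": "<abc123@example.com>",
    "from": "carol@example.com",
    "to": "Alice <alice@example.com>, bob@example.com",
    "cc": "",
    "date": "Mon, 04 Mar 2024 10:15:00 +0000",
    "subject": "Quarterly report",
    "body": "Please find the report attached.",
}
document = to_search_document(sample)
print(document)
```

A list of such dicts can then be passed to `SearchClient.upload_documents(documents=[...])` from `azure.search.documents`.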


  • Azure Files indexer with Azure AI Search

In the Azure portal: [screenshot not available]

  • Indexing file contents and metadata in Azure Cognitive Search.

Answer by David

I was able to create an index and indexer that allow me to query on the following fields:

  • metadata_content_type
  • metadata_message_from
  • metadata_message_from_email
  • metadata_message_to
  • metadata_message_to_email
  • metadata_message_cc
  • metadata_message_cc_email
  • metadata_message_bcc
  • metadata_message_bcc_email
  • metadata_creation_date
  • metadata_last_modified
  • metadata_subject
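
An index carrying these fields might be defined roughly as follows. This is a sketch under assumptions, not the index I actually used: the field types (string collections for the three `*_email` recipient fields, plain strings for the rest) are my reading of the documentation and should be checked there; the key field name `id` and the index name `msg-metadata-index` are hypothetical; `content` is the field the blob indexer normally fills with the extracted text.

```python
import json

# Assumed types: verify against the blob metadata properties documentation.
STRING_FIELDS = [
    "metadata_content_type",
    "metadata_message_from",
    "metadata_message_from_email",
    "metadata_message_to",
    "metadata_message_cc",
    "metadata_message_bcc",
    "metadata_creation_date",
    "metadata_last_modified",
    "metadata_subject",
]
COLLECTION_FIELDS = [
    "metadata_message_to_email",
    "metadata_message_cc_email",
    "metadata_message_bcc_email",
]


def build_index(name: str) -> dict:
    """Build an index definition (as a plain dict) for .msg metadata fields."""
    fields = [
        # Hypothetical key field; it needs a value mapped to it by the indexer.
        {"name": "id", "type": "Edm.String", "key": True},
        # Extracted message text.
        {"name": "content", "type": "Edm.String", "searchable": True},
    ]
    for field in STRING_FIELDS:
        fields.append({"name": field, "type": "Edm.String",
                       "searchable": True, "filterable": True})
    for field in COLLECTION_FIELDS:
        fields.append({"name": field, "type": "Collection(Edm.String)",
                       "searchable": True, "filterable": True})
    return {"name": name, "fields": fields}


index_definition = build_index("msg-metadata-index")
print(json.dumps(index_definition, indent=2))
```

The printed JSON has the same shape as the index definitions shown earlier on this page, so it can be used wherever an index definition is accepted.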

https://learn.microsoft.com/en-us/azure/search/search-blob-metadata-properties
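
To illustrate querying on those fields, here is a small sketch that builds an OData `$filter` expression. The helper names are made up for this example, and it assumes the fields were created as filterable and that the `*_email` recipient fields are string collections, which is why they are matched with `any(...)`.

```python
def odata_string(value: str) -> str:
    """Quote a string literal for OData; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def equals(field: str, value: str) -> str:
    """Filter on a single-valued string field."""
    return f"{field} eq {odata_string(value)}"


def any_equals(field: str, value: str) -> str:
    """Filter on a string collection: true if any element equals the value."""
    return f"{field}/any(x: x eq {odata_string(value)})"


filter_expr = " and ".join([
    equals("metadata_message_from_email", "carol@example.com"),
    any_equals("metadata_message_to_email", "alice@example.com"),
])
print(filter_expr)
```

The resulting string can be passed as the `filter` argument of `SearchClient.search(...)`, alongside a `search_text` for the full-text part of the query.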