How can I fetch all the asset details under a collection in Azure Purview, to back up the collection?


Through the API I can only get the collection details, which give just basic information about the collection. I want to retrieve the details of the assets that sit under the collection.

I have tried with the GET API call ({{Endpoint_Uri}}/collections/?api-version=2019-11-01-preview) and tried with Python as well. Below is the code:

from purviewautomation import (ServicePrincipalAuthentication,
                                PurviewCollections)

auth = ServicePrincipalAuthentication(tenant_id="Tenantid",
                                      client_id="client_id",
                                      client_secret="client_secret")

client = PurviewCollections(purview_account_name="account_name",auth=auth)
print(client.list_collections())

Ikhtesam Afrin On

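You can use the Python code below to export assets to a CSV file. Note that with keywords set to "*" and no filter, the search is not restricted to a single collection; it returns assets from everywhere the service principal can read.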

from azure.purview.catalog import PurviewCatalogClient
from azure.identity import ClientSecretCredential
from azure.core.exceptions import HttpResponseError
import pandas as pd

keywords = "*"  # wildcard: match every asset
export_csv_path = "purview_search_export.csv"

tenant_id = "{tenant_id}"
client_id = "{client_id}"
client_secret = "{client_secret}"
purview_endpoint = "https://{purview_name}.purview.azure.com/"
purview_scan_endpoint = "https://{purview_name}.scan.purview.azure.com/"  # not used below

def get_credentials():
    # Authenticate as a service principal
    credentials = ClientSecretCredential(client_id=client_id, client_secret=client_secret, tenant_id=tenant_id)
    return credentials

def get_catalog_client():
    credentials = get_credentials()
    client = PurviewCatalogClient(endpoint=purview_endpoint, credential=credentials, logging_enable=True)
    return client

# Search request body
body_input = {
    "keywords": keywords
}

try:
    catalog_client = get_catalog_client()
except ValueError as e:
    print(e)

try:
    response = catalog_client.discovery.query(search_request=body_input)
    # The matching assets are returned in the "value" list of the response
    jdf = pd.json_normalize(response["value"])
    jdf.to_csv(export_csv_path, index=False)
except HttpResponseError as e:
    print(e)
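
To restrict the export to one collection and to go beyond a single page of results, the search body can carry a filter plus limit and offset. The following is only a sketch under assumptions: that the search filter accepts a "collectionId" key, that the body accepts "limit" and "offset", and that results come back under "value". The helper names build_search_body and iter_assets are hypothetical, and the query argument stands in for a call such as catalog_client.discovery.query.

```python
from typing import Any, Callable, Dict, Iterator


def build_search_body(collection_id: str, keywords: str = "*",
                      limit: int = 100, offset: int = 0) -> Dict[str, Any]:
    """Build a search body restricted to one collection.

    Assumption: the filter accepts a "collectionId" key.
    """
    return {
        "keywords": keywords,
        "limit": limit,
        "offset": offset,
        "filter": {"collectionId": collection_id},
    }


def iter_assets(query: Callable[[Dict[str, Any]], Dict[str, Any]],
                collection_id: str,
                page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every search hit in the collection, one page at a time."""
    offset = 0
    while True:
        page = query(build_search_body(collection_id,
                                       limit=page_size, offset=offset))
        items = page.get("value", [])
        if not items:
            break
        yield from items
        # A short page means there is nothing left to fetch.
        if len(items) < page_size:
            break
        offset += page_size
```

With the client from the code above, this could be used as list(iter_assets(lambda body: catalog_client.discovery.query(search_request=body), "{collection_id}")). Very large catalogs may hit a service-side cap on offset paging, so check the search API documentation before relying on this for a full backup.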

You can also refer to this blog, which has detailed information about exporting assets to CSV.

AFAIK, there is no REST API that directly lists the assets in a collection. As per the Microsoft docs, you can fetch asset details by GUID, by unique attribute, or by classification.
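
Since asset details can be fetched by GUID, a backup could look up each search hit's GUID and save the full entity to a file. This is a minimal sketch, assuming each search hit carries the entity GUID in its "id" field. The helpers collect_entity_details and write_backup are hypothetical names, and get_entity stands in for whatever call fetches one entity by GUID (for example catalog_client.entity.get_by_guid, if your SDK version provides it).

```python
import json
from typing import Any, Callable, Dict, Iterable, List


def collect_entity_details(
    search_hits: Iterable[Dict[str, Any]],
    get_entity: Callable[[str], Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Fetch the full entity for every search hit that has a GUID."""
    details = []
    for hit in search_hits:
        guid = hit.get("id")  # assumption: the GUID is stored under "id"
        if guid:
            details.append(get_entity(guid))
    return details


def write_backup(details: List[Dict[str, Any]], path: str) -> None:
    """Write the collected entities to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(details, fh, indent=2)
```

JSON keeps the nested attributes and classifications of each entity intact, which a flattened CSV would lose.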