Data Lineage in Purview insufficient

Azure Purview currently shows data lineage from ADF only for Copy activities. Is this sufficient? This article states: "By pushing metadata from Azure Data Factory into Azure Purview a reliable and transparent lineage tracking is enabled." Does that go above and beyond the Copy activity? If yes, how can we achieve this?

Is there any other way in Azure to view complete data lineage? Assume we are using ADF/Synapse/Azure Databricks.

There is 1 solution below


Tools such as Data Factory, Data Share, Synapse, and Azure Databricks belong to the category of data processing systems. The data processing systems currently integrated with Purview for lineage are listed in the Azure Purview Data Catalog lineage user guide.

Currently, the Azure Data Factory integration supports lineage for the following activities: Copy activity, Data flow activity, and Execute SSIS package activity. The integration between Data Factory and Purview also supports only a subset of the data systems that Data Factory supports, as described here.
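To get a rough idea of how much of an existing pipeline would show up in Purview, you could scan the pipeline definition for those three activity kinds. The following is only a sketch: it assumes the ADF activity `type` strings `Copy`, `ExecuteDataFlow`, and `ExecuteSSISPackage` map to the activities listed above, the pipeline and activity names are hypothetical, and it does not descend into activities nested inside control-flow activities such as ForEach.

```python
import json
from typing import Dict, List, Tuple

# Assumption: these ADF activity "type" values correspond to the Copy,
# Data flow, and Execute SSIS package activities mentioned above.
LINEAGE_ACTIVITY_TYPES = {"Copy", "ExecuteDataFlow", "ExecuteSSISPackage"}


def split_by_lineage_support(pipeline: Dict) -> Tuple[List[str], List[str]]:
    """Return (covered, not_covered) top-level activity names for one pipeline."""
    activities = pipeline.get("properties", {}).get("activities", [])
    covered: List[str] = []
    not_covered: List[str] = []
    for activity in activities:
        name = activity.get("name", "<unnamed>")
        if activity.get("type") in LINEAGE_ACTIVITY_TYPES:
            covered.append(name)
        else:
            not_covered.append(name)
    return covered, not_covered


# Hypothetical pipeline, shaped like an exported ADF pipeline JSON document.
sample_pipeline = json.loads("""
{
  "name": "SalesPipeline",
  "properties": {
    "activities": [
      {"name": "CopySales", "type": "Copy"},
      {"name": "RunProc", "type": "SqlServerStoredProcedure"},
      {"name": "TransformSales", "type": "ExecuteDataFlow"}
    ]
  }
}
""")

covered, not_covered = split_by_lineage_support(sample_pipeline)
print("Lineage expected for:", covered)
print("No lineage expected for:", not_covered)
```

Anything that lands in the second list is a candidate for custom lineage reporting.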

Azure Purview currently doesn't support queries or stored procedures for lineage or scanning. Lineage is limited to table and view sources only.

Some additional ways of finding information in the lineage view include the following:

  • In the Lineage tab, hover over shapes to preview additional information about the asset in the tooltip.
  • Select a node or edge to see the asset type it belongs to, or to switch assets.
  • Columns of a dataset are displayed in the left side of the Lineage tab. For more information about column-level lineage, see Dataset column lineage.

Custom lineage reporting is also supported via Atlas hooks and the REST API. Data integration and ETL tools can push lineage into Azure Purview at execution time.
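As an illustration of what custom lineage over the REST API could look like, here is a minimal sketch that builds an Apache Atlas v2 style payload: a process entity whose `inputs` and `outputs` reference existing assets by `qualifiedName`. The helper names, the asset type name, and the qualified names are hypothetical. The payload shape follows the Atlas v2 bulk entity endpoint (`POST /api/atlas/v2/entity/bulk`); the base URL and authentication for your Purview account are deliberately left out, so check them against the Purview REST documentation before sending anything.

```python
import json
from typing import Dict, List, Tuple

# (type_name, qualified_name) pair identifying an asset already in the catalog.
AssetRef = Tuple[str, str]


def _object_ref(asset: AssetRef) -> Dict:
    """Reference an existing entity by its unique qualifiedName attribute."""
    type_name, qualified_name = asset
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_lineage_payload(
    name: str,
    qualified_name: str,
    inputs: List[AssetRef],
    outputs: List[AssetRef],
    process_type: str = "Process",  # Atlas base type; a custom subtype may fit better
) -> Dict:
    """Build a bulk-entity payload describing one process linking inputs to outputs."""
    entity = {
        "typeName": process_type,
        "attributes": {
            "name": name,
            "qualifiedName": qualified_name,
            "inputs": [_object_ref(a) for a in inputs],
            "outputs": [_object_ref(a) for a in outputs],
        },
    }
    return {"entities": [entity]}


# Hypothetical example: a Databricks notebook reading one table and writing another.
payload = build_lineage_payload(
    name="nightly_sales_aggregation",
    qualified_name="custom://etl/nightly_sales_aggregation",
    inputs=[("azure_sql_table", "mssql://example.database.windows.net/db/dbo/sales")],
    outputs=[("azure_sql_table", "mssql://example.database.windows.net/db/dbo/sales_agg")],
)
print(json.dumps(payload, indent=2))
```

The input and output assets must already exist in the catalog (for example, from a scan) under exactly those qualified names, otherwise the references cannot be resolved.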

Connecting an Azure Purview Account to a Synapse workspace allows you to discover Azure Purview assets and interact with them through Synapse capabilities.

Here is a list of the Azure Purview features that are available in Synapse:

  • Use the search box at the top to find Purview assets based on keywords
  • Understand the data based on metadata, lineage, annotations
  • Connect those data to your workspace with linked services or integration datasets
  • Analyze those datasets with Synapse Apache Spark, Synapse SQL, and Data Flow
  • Overview of the metadata, view and edit schema of the metadata with classifications, glossary terms, data types, and descriptions
  • View lineage to understand dependencies and do impact analysis.
  • View and edit Contacts to know who is an owner or expert over a dataset
  • Use Related to understand the hierarchical dependencies of a specific dataset. This experience is helpful for browsing through the data hierarchy.
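The impact analysis mentioned in the list above can also be done outside the UI once lineage has been exported. The sketch below assumes the lineage is available as a list of from/to pairs, shaped like the `relations` field of an Apache Atlas lineage response (`fromEntityId`, `toEntityId`); the IDs used here are hypothetical placeholders rather than real GUIDs.

```python
from collections import defaultdict, deque
from typing import Dict, List


def downstream_assets(relations: List[Dict[str, str]], start_id: str) -> List[str]:
    """Breadth-first walk returning every entity reachable downstream of start_id."""
    children = defaultdict(list)
    for relation in relations:
        children[relation["fromEntityId"]].append(relation["toEntityId"])

    visited = {start_id}
    ordered: List[str] = []
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for nxt in children.get(current, []):
            if nxt not in visited:
                visited.add(nxt)
                ordered.append(nxt)
                queue.append(nxt)
    return ordered


# Hypothetical lineage: raw -> copy -> staged -> flow -> curated, other -> flow
relations = [
    {"fromEntityId": "raw", "toEntityId": "copy"},
    {"fromEntityId": "copy", "toEntityId": "staged"},
    {"fromEntityId": "staged", "toEntityId": "flow"},
    {"fromEntityId": "flow", "toEntityId": "curated"},
    {"fromEntityId": "other", "toEntityId": "flow"},
]

impacted = downstream_assets(relations, "staged")
print(impacted)
```

Here a change to `staged` would affect the `flow` process and the `curated` dataset, but not `raw` or `other`.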