How exactly should I use GCP Data Catalog?

I work on a project that involves hundreds of tables across dozens of datasets. However, we currently have no catalog that lets us easily locate specific data within these tables, or even find out whether a table is still being updated, where its data comes from, and so on.

I suggested that we could implement a data catalog for this purpose, but I am a bit uncertain about the concept of a data catalog. For instance, if I were to create a tag template with all the necessary information about the tables (such as origin, presence of Personally Identifiable Information (PII), owner, update frequency, etc.) and then proceed to tag all the tables, what would be the next step?
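To make the idea concrete, here is a rough sketch of the kind of template and tag I have in mind. It is plain Python, not the Data Catalog API: the field names, types, and the `validate_tag` helper are all assumptions made up for illustration.

```python
from typing import Dict, List

# Hypothetical template: field name -> expected Python type.
# These names mirror the examples in the question, not a real Data Catalog template.
TEMPLATE_FIELDS: Dict[str, type] = {
    "origin": str,
    "has_pii": bool,
    "owner": str,
    "update_frequency": str,
}


def validate_tag(tag: Dict[str, object]) -> List[str]:
    """Return a list of problems found in a tag; an empty list means the tag is valid."""
    problems = []
    # Every template field must be present and have the expected type.
    for name, expected in TEMPLATE_FIELDS.items():
        if name not in tag:
            problems.append(f"missing field: {name}")
        elif not isinstance(tag[name], expected):
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    # Fields that the template does not define are reported as well.
    for name in tag:
        if name not in TEMPLATE_FIELDS:
            problems.append(f"unknown field: {name}")
    return problems
```

A check like this would let us reject incomplete tags before they are attached to a table, so that every table ends up with the same set of fields.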

Can I, for example, generate a PDF containing all the values tagged to the tables? Alternatively, when a new data engineer needs to familiarize themselves with the tables, should they access the Dataplex interface and begin their search from there?
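As an illustration of the kind of export I mean, the sketch below renders tagged values as a Markdown table, which could then be converted to PDF with a tool such as pandoc. It assumes the tags have already been read out of the catalog into a plain dictionary; the table names, field names, and the `tags_to_markdown` helper are hypothetical.

```python
from typing import Dict, List

# Hypothetical column order for the report; the first column is the table name.
COLUMNS: List[str] = ["table", "origin", "has_pii", "owner", "update_frequency"]


def tags_to_markdown(tagged_tables: Dict[str, Dict[str, object]]) -> str:
    """Render {table name: tag values} as a Markdown table, sorted by table name."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for table in sorted(tagged_tables):
        tag = tagged_tables[table]
        # Missing tag fields are rendered as empty cells.
        row = [table] + [str(tag.get(column, "")) for column in COLUMNS[1:]]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "sales.orders": {"origin": "erp", "has_pii": False, "owner": "sales-eng", "update_frequency": "hourly"},
        "crm.customers": {"origin": "crm_export", "has_pii": True, "owner": "data-eng", "update_frequency": "daily"},
    }
    print(tags_to_markdown(sample))
```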

I would appreciate any suggestions on how to use the data catalog effectively. Thank you!
