Programmatically adding tags to Data Catalog Custom entries


I am trying to attach tags to Data Catalog custom entries. I am writing a Python function to perform Data Catalog operations, i.e. create/delete custom entries, create/delete tag templates, and attach tags to the fields of the created custom entries.

I was able to create a custom entry and a tag template using the datacatalog_v1 library; however, I can't find a method or a REST API to attach tags to the columns of the custom entry.

I am, however, able to do this through the GCP web console.


There is 1 best solution below


The following examples show how to work with the Data Catalog REST API. You can also refer to the documentation that Google provides here.

  1. Create an entry group

Before using any of the request data, make the following replacements:

  1. project-id: Your GCP project ID.

  2. entryGroupId: The ID must begin with a letter or underscore, contain only English letters, numbers, and underscores, and be at most 64 characters long.

  3. displayName: The textual name for the entry group.
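Since the question is about a Python function, the entryGroupId rule above can be checked locally before sending anything. This is only a minimal sketch: the helper name `is_valid_entry_group_id` is made up for illustration, and the pattern encodes nothing more than the rule as quoted above.

```python
import re

# Rule as quoted above: starts with a letter or underscore, then only
# English letters, digits, and underscores, 64 characters at most.
_ENTRY_GROUP_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def is_valid_entry_group_id(entry_group_id: str) -> bool:
    """Return True if the ID satisfies the naming rule quoted above."""
    # fullmatch requires the whole string to match, so trailing
    # characters such as "-" or a newline are rejected.
    return _ENTRY_GROUP_ID.fullmatch(entry_group_id) is not None
```

For example, `is_valid_entry_group_id("my_entry_group")` passes, while an ID that starts with a digit or contains a hyphen does not.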

HTTP method and URL:

POST https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups?entryGroupId=entryGroupId

Request JSON body:

{
  "displayName": "Entry Group display name"
}

Save the request body in a file called request.json, and execute the following PowerShell command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
 -Method POST `
 -Headers $headers `
 -ContentType: "application/json; charset=utf-8" `
 -InFile request.json `
 -Uri "https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups?entryGroupId=entryGroupId" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/my_projectid/locations/us-central1/entryGroups/my_entry_group",
  "displayName": "Entry Group display name",
  "dataCatalogTimestamps": {
    "createTime": "2019-10-19T16:35:50.135Z",
    "updateTime": "2019-10-19T16:35:50.135Z"
  }
}
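If you would rather send the same request from Python, something along these lines should work using only the standard library. Treat it as a sketch: the function name `build_create_entry_group_request` and its parameters are my own, and the access token is assumed to come from `gcloud auth application-default print-access-token`, as in the PowerShell command above.

```python
import json
import urllib.request

BASE_URL = "https://datacatalog.googleapis.com/v1"


def build_create_entry_group_request(project_id, location, entry_group_id,
                                     display_name, access_token):
    """Build (but do not send) the POST request that creates an entry group."""
    url = (f"{BASE_URL}/projects/{project_id}/locations/{location}"
           f"/entryGroups?entryGroupId={entry_group_id}")
    # Same JSON body as request.json above.
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json; charset=utf-8",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Sending it (needs network access and a valid token):
#   req = build_create_entry_group_request(
#       "my_projectid", "us-central1", "my_entry_group",
#       "Entry Group display name", token)
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything is created in your project.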

You can structure your tags by topic using tag templates. For example:

  1. A data governance tag with fields for: data governor, retention date, deletion date, PII (yes or no), data classification (public, confidential, sensitive, regulatory)

  2. A data quality tag with fields for: quality issues, update frequency, SLO information

  3. A data usage tag with fields for: top users, top queries, average daily users
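To connect this back to the question, here is a rough sketch of the two request bodies involved: one for a template like the data governance example, and one for a tag attached to a single column of an entry. The field IDs (`data_governor`, `has_pii`, and so on) and both function names are invented for illustration. As far as I know, the template is created with a POST to `.../projects/project-id/locations/us-central1/tagTemplates?tagTemplateId=...`, and the tag with a POST to the entry's `/tags` collection, where the `column` field is what scopes the tag to one column. Check both against the REST reference, and note that I am assuming the custom entry already defines a schema containing that column.

```python
BASE_URL = "https://datacatalog.googleapis.com/v1"


def governance_template_body():
    """Body for a 'data governance' tag template (field IDs are illustrative)."""
    classes = ["PUBLIC", "CONFIDENTIAL", "SENSITIVE", "REGULATORY"]
    return {
        "displayName": "Data governance",
        "fields": {
            "data_governor": {"displayName": "Data governor",
                              "type": {"primitiveType": "STRING"}},
            "retention_date": {"displayName": "Retention date",
                               "type": {"primitiveType": "TIMESTAMP"}},
            "deletion_date": {"displayName": "Deletion date",
                              "type": {"primitiveType": "TIMESTAMP"}},
            "has_pii": {"displayName": "PII",
                        "type": {"primitiveType": "BOOL"}},
            "data_classification": {
                "displayName": "Data classification",
                "type": {"enumType": {
                    "allowedValues": [{"displayName": c} for c in classes]}},
            },
        },
    }


def column_tag_request(entry_name, template_name, column,
                       governor, has_pii, classification):
    """Return (url, body) for attaching a governance tag to one column.

    entry_name and template_name are full resource names, e.g.
    "projects/.../locations/.../entryGroups/.../entries/..." (assumed format).
    """
    url = f"{BASE_URL}/{entry_name}/tags"
    body = {
        "template": template_name,
        "column": column,  # omit this key to tag the whole entry instead
        "fields": {
            "data_governor": {"stringValue": governor},
            "has_pii": {"boolValue": has_pii},
            "data_classification": {"enumValue": {"displayName": classification}},
        },
    }
    return url, body
```

If you stay with the datacatalog_v1 library instead of raw REST, the client's `create_tag(parent=..., tag=...)` call should take the same information, with `parent` set to the entry name and the tag's `column` set to the column you want to tag.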