Attaching tags to columns in Data Catalog using Python


I have a table in BigQuery and I am trying to attach tags to columns depending on their prefix. For example, all columns that start with ABC_ have to be tagged with the Private Info tag.

I have written the code below:

dataset_id = 'my_dataset'
for table in bigquery_client.list_tables(dataset_id):
    # Get the schema of the table
    table_ref = f'{project_id}.{dataset_id}.{table.table_id}'
    table = bigquery_client.get_table(table_ref)
    schema = table.schema
    table_id = table.table_id

    # Loop through the schema fields, and create tags for columns that match the criteria
    for field in schema:
        if field.name.startswith('RUR_'):
            tag = datacatalog.Tag()
            tag.template = f'projects/{project_id}/locations/us-central1/tagTemplates/{tag_template_id}'
            tag.fields['owner'].string_value = 'John Doe'
            tag = datacatalog_client.create_tag(parent=f'projects/{project_id}/locations/us-central1/entryGroups/{entry_group_id}/entries/{table_id}/fields/{field.name}', tag=tag)
            print(f'Tag created for column {field.name} in table {table_id}')

But I am getting a "Resource Name Invalid" error, which says that a column cannot be a resource.

Can someone suggest how this can be done in GCP?

Thanks in advance:)


There is 1 best solution below

guillaume blaquiere On

You have to follow the API description of the create-tag method. The parent parameter ends after entries/{table_id}; it cannot contain a /fields/{column} suffix. The column has to go in the body of the request, i.e. in the Tag object (its column field).
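As a rough sketch of what that could look like with the Python client (not tested against a real project): the helper names, the "my-project" and "my_template" placeholders, and the use of lookup_entry to resolve the entry name are my assumptions, not something from the question. The point is only that parent stops at the entry and the column name is set on the tag itself.

```python
def bigquery_linked_resource(project_id, dataset_id, table_id):
    """Full resource name Data Catalog uses to look up a BigQuery table entry."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def tag_prefixed_columns(client, dc_types, entry_name, template_name,
                         column_names, prefix, owner):
    """Create one tag per column whose name starts with `prefix`.

    `dc_types` is expected to expose Tag and TagField (e.g. the
    google.cloud.datacatalog_v1 module). Returns the tagged column names.
    """
    tagged = []
    for column in column_names:
        if not column.startswith(prefix):
            continue
        tag = dc_types.Tag()
        tag.template = template_name
        tag.column = column  # the column goes in the request body, not in parent
        tag.fields["owner"] = dc_types.TagField(string_value=owner)
        client.create_tag(parent=entry_name, tag=tag)  # parent ends at the entry
        tagged.append(column)
    return tagged


def main():
    # Imported here so the helpers above can be used without the client libraries.
    from google.cloud import bigquery, datacatalog_v1

    project_id = "my-project"        # hypothetical placeholder
    dataset_id = "my_dataset"
    tag_template_id = "my_template"  # hypothetical placeholder

    bigquery_client = bigquery.Client(project=project_id)
    datacatalog_client = datacatalog_v1.DataCatalogClient()
    template_name = (
        f"projects/{project_id}/locations/us-central1"
        f"/tagTemplates/{tag_template_id}"
    )

    for item in bigquery_client.list_tables(dataset_id):
        table = bigquery_client.get_table(item.reference)
        # Assumption: resolve the entry name through lookup_entry instead of
        # building entryGroups/.../entries/... by hand.
        entry = datacatalog_client.lookup_entry(
            request={
                "linked_resource": bigquery_linked_resource(
                    project_id, dataset_id, table.table_id
                )
            }
        )
        tagged = tag_prefixed_columns(
            datacatalog_client, datacatalog_v1, entry.name, template_name,
            [field.name for field in table.schema], "RUR_", "John Doe",
        )
        print(f"Tagged columns {tagged} in table {table.table_id}")
```

Call main() once credentials are configured. The client and the types module are passed in as arguments so that tag_prefixed_columns can be tried with stand-in objects before it touches a real catalog.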
