We ran two queries against the same table in two scenarios:
First, we ran a query that creates a table by selecting data from another table (e.g. ABC). Table ABC has no policy tags (data masking rules) on any of its columns:
create table project.dataset.ABC_NOT_MASKED
as select col1, col2 from project.dataset.ABC
Then we added a policy tag with a data masking rule (Hash (SHA256)) to some columns of table ABC and ran a similar query:
create table project.dataset.ABC_MASKED
as select col1, col2 from project.dataset.ABC
Both queries cost the same, whether or not the columns had a data masking policy. The bytes processed were also the same for both queries, about 99 GiB. Based on https://cloud.google.com/dlp/pricing#inspection_and_transformation_pricing, we expected the query with data masking to cost more.
Why is this? How is billing calculated for data masking?
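For anyone reproducing the comparison, the bytes processed and billed by the two jobs can be read from the BigQuery jobs metadata view. The query below is only a minimal sketch, not necessarily how the figures above were obtained. It assumes the jobs ran in the US multi-region (hence the `region-us` qualifier) within the last 7 days, and the `target_table` and `gib_processed` aliases are illustrative names.

```sql
-- Sketch: compare bytes processed and billed for the two CREATE TABLE jobs.
-- Assumptions: jobs ran in the US multi-region (`region-us`) within the last
-- 7 days; adjust the region qualifier and the time window to your setup.
SELECT
  job_id,
  destination_table.table_id AS target_table,
  total_bytes_processed,
  total_bytes_billed,
  -- Convert bytes to GiB (1 GiB = 1024^3 bytes) for easier reading.
  ROUND(total_bytes_processed / POW(1024, 3), 2) AS gib_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND statement_type = 'CREATE_TABLE_AS_SELECT'
  AND destination_table.table_id IN ('ABC_NOT_MASKED', 'ABC_MASKED')
ORDER BY creation_time;
```

If both rows show the same `total_bytes_billed`, the masking rule did not change the billed query bytes for these jobs.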