Google Cloud's Datastream: Not replicating deletes to BigQuery

256 Views Asked by At

I have a stream setup with Datastream where a PostgreSQL database on Cloud SQL replicates to BigQuery. When data is inserted to the PostgreSQL database, replication to BigQuery has been working just fine.

I went to delete a row and it's not being replicated and returning an error:

event_code: "UNSUPPORTED_EVENTS_DISCARDED"
message: "Discarded 1 unsupported events for BigQuery destination: project_id.dataset.my_table, with reason code: BIGQUERY_UNSUPPORTED_PRIMARY_KEY_CHANGE, details: Failed to write to BigQuery due to an unsupported primary key change: adding primary keys to existing tables is not supported.."
object_name: "my_table"

So now, I have 179 rows in the PostgreSQL database and 180 in BigQuery because it's not replicating the delete. Not sure what is going on since I did not change anything with Primary Keys and I'm not adding any Primary Keys...

The only similar question I've seen is this one but has to do with a table partition and a new table needs to be created. Does a new table need to be created for each delete as well?

1

There are 1 best solutions below

0
On

It's likely that a Datastream bug is the cause of this problem, or may be the problem is caused by anything in your data, as datastream won't be able to replicate a delete, if you are deleting a row that is linked to by a foreign key in another table.

In order to determine whether a foreign key constraint is blocking the delete, check the BigQuery table. Or, try using a different technique, like SQL console or Cloud SQL API, to delete the row from PostgreSQL database.

you can also try making a new BigQuery table and replicate the PostgreSQL data to it if you are still unable to fix the problem, because in addition to overwriting current BigQuery table, this will also fix any problems with the primary key.

Hope it works :)