I am using GCP Datastream to replicate both historical and CDC data from MySQL to BigQuery. The historical load arrives, but no incremental/CDC data reaches BigQuery. I believe the Datastream and MySQL configuration is correct, because I have tested the same setup in another GCP project. I now have to implement this in a second GCP project, but there no incremental data is getting into BigQuery.

I expect Datastream to deliver CDC/incremental data from MySQL. I checked the MySQL binlog files, and the changes are being written there correctly.

There is 1 best solution below


I'm assuming your setup loads the data from MySQL into Cloud Storage and then pushes it from Cloud Storage to, say, BigQuery? If that is your implementation, do the following:

  1. Use Datastream to load the data into Cloud Storage (create source and destination connection profiles).
  2. Create a Pub/Sub topic.
  3. To move the data into BigQuery with CDC, use Dataflow. There is a "Datastream to BigQuery" template: you specify the dataset to sink the data into and associate it with the Pub/Sub topic you created. If it succeeds, the respective tables are created with additional metadata columns.
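
As a rough sketch of step 3, the snippet below only assembles the `gcloud dataflow flex-template run` command for the Datastream to BigQuery template; it does not run anything. The project, bucket, subscription and dataset names are all placeholders, the helper function is hypothetical, and the template path and parameter names are as I remember them from the template documentation, so please verify them against the current docs before using this.

```python
import shlex


def build_datastream_to_bq_command(project, region, job_name, input_pattern,
                                   subscription, staging_dataset,
                                   replica_dataset, dlq_dir):
    """Return the gcloud argument list for the Datastream to BigQuery template.

    Hypothetical helper: every value passed in is a placeholder chosen by the
    caller, and the parameter names below are assumptions to double-check.
    """
    # Google-hosted flex template location (assumed; confirm for your region).
    template = (
        f"gs://dataflow-templates-{region}/latest/flex/"
        "Cloud_Datastream_to_BigQuery"
    )
    params = {
        # Cloud Storage path that the Datastream stream writes into.
        "inputFilePattern": input_pattern,
        # Pub/Sub subscription receiving the bucket's file notifications.
        "gcsPubSubSubscription": subscription,
        # Must match the output format configured on the stream.
        "inputFileFormat": "avro",
        # Dataset for staging tables and dataset for the merged replica tables.
        "outputStagingDatasetTemplate": staging_dataset,
        "outputDatasetTemplate": replica_dataset,
        # Where records that fail processing are written.
        "deadLetterQueueDirectory": dlq_dir,
    }
    joined = ",".join(f"{key}={value}" for key, value in params.items())
    return [
        "gcloud", "dataflow", "flex-template", "run", job_name,
        f"--project={project}",
        f"--region={region}",
        f"--template-file-gcs-location={template}",
        f"--parameters={joined}",
    ]


# Example with placeholder names; print the command so it can be reviewed.
cmd = build_datastream_to_bq_command(
    project="my-project",
    region="us-central1",
    job_name="mysql-cdc-to-bq",
    input_pattern="gs://my-datastream-bucket/mysql/",
    subscription="projects/my-project/subscriptions/datastream-files-sub",
    staging_dataset="cdc_staging",
    replica_dataset="cdc_replica",
    dlq_dir="gs://my-datastream-bucket/dlq/",
)
print(shlex.join(cmd))
```

Since the historical load works but CDC does not, it is worth checking that new files keep landing under the input path after the backfill and that the subscription is actually receiving notifications for them in the new project.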

This is not too detailed, but that's a summary of how I was able to implement CDC from MySQL.