On demand snapshot and CDC from Postgresql multitenant DB

36 Views Asked by At

I have a Postgresql multitenant DB where each table contains tenant level metadata. e.g.

                                     Table "connector"
     Column     |            Type             | Collation | Nullable |          Default          
----------------+-----------------------------+-----------+----------+---------------------------
 id             | uuid                        |           | not null | public.uuid_generate_v4()
 tenant_id      | character varying(255)      |           | not null | 
 last_sync_date | timestamp without time zone |           |          | 
 type           | character varying(255)      |           | not null | 
 created_date   | timestamp without time zone |           |          | 
 modified_date  | timestamp without time zone |           |          | 
 created_by     | character varying(255)      |           |          | 
 modified_by    | character varying(255)      |           |          | 
 version        | bigint                      |           |          | 
 is_active      | boolean                     |           | not null | true

I am using AWS RDS and we have one DB per region. We need to support data migration scenarios wherein data for one tenant needs to be moved from one region DB to another. Basically the requirements are :

  • Trigger data migration on tenant (customer's request) on demand
  • This would include re-creating an initial snapshot of the tenant data and al incremental updates via CDC till the actual migration is completed.
  • Have an easy data consistency validation mechanism.
  • Migration completion is considered complete with a switch of all traffic for one tenant from source region to destination region.

I am looking at Debezium's CDC with incremental snapshots via signalling tables. But I am not sure if it supports the below:

  • Does it support on-demand DB replication start/stop? I will start replication once with the trigger and stop it on migration completion
  • I don't want to trigger snapshot every time with incremental snapshots. Can I do it for only certain tenants?
  • Do I need to create a new replication slot for each tenant that I wan't to migrate. Creation of too many replication slots can cause WAL logs accumulating in case of consumer failures.

I am considering CDC mechanism so as to avoid doing bulk data exports in chunks during migration as that can be slow and error prone. Would really appreciate some suggestions here as I am new to RDBMS CDC in general.

I have tried to capture whatever I felt was relevant to my scenario. Please let me know in case any specific area which is unclear and needs more explanation.

0

There are 0 best solutions below