I have an open source Redis cluster with 3 masters and 3 replicas. Snapshotting is enabled, so an RDB file is created on every master and replica.
I want to safekeep the data in an S3 bucket as well, and I am trying to understand what the best strategy would be.
Should I simply send all 6 RDB files to the S3 bucket?
In case of disaster, when it's time to restore the backup, how will I know which RDB to use? The cluster topology can change in the interim (a replica can become a master if its master fails).
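As long as the destination host (an existing master or replica, or a fresh new host) runs the same Redis version as the source or a newer one, you can copy the RDB file into its Redis data directory and start it up. A newer Redis can read RDB files written by older versions, but an older Redis may refuse a file written by a newer one.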
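During start-up, the node loads its settings from the Redis configuration file and loads the data from the RDB file. In cluster mode, its role and slot assignment come from the cluster state (the file named by cluster-config-file, nodes.conf by default), not from the RDB file, so the same RDB works whether the node ends up as a master or a replica.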
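Having said that, you could just back up one RDB per shard to S3, taken from whichever node is the master of that shard at backup time, and use it for the master and all replicas of that shard when restoring. Since a master and its replicas hold the same slots, you do not need all 6 files.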
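To handle the changing topology, one option is to ask the cluster who the masters are at backup time and name each S3 object after the slot range it covers rather than after the host. Below is a minimal sketch of that idea in Python. It only parses the text returned by CLUSTER NODES; the function names, the `redis-backups/` key prefix, and the sample output are assumptions for illustration, and the actual upload (for example with boto3) is left out.

```python
def masters_from_cluster_nodes(text):
    """Map each healthy master's address to the slot ranges it owns.

    `text` is the raw output of the CLUSTER NODES command.
    """
    masters = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a well-formed node line
        addr = fields[1].split("@")[0]   # drop the cluster bus port
        flags = fields[2].split(",")
        if "master" not in flags or "fail" in flags:
            continue
        # Slot entries follow the 8 fixed fields; entries in [brackets]
        # describe slots being migrated or imported and are skipped here.
        slots = [s for s in fields[8:] if not s.startswith("[")]
        if slots:
            masters[addr] = slots
    return masters


def s3_key(slots, timestamp):
    """Build a hypothetical S3 object key that records the slot range."""
    return f"redis-backups/{timestamp}/slots-{'_'.join(slots)}.rdb"


# Sample CLUSTER NODES output for a 3 master / 3 replica cluster (made-up IDs).
SAMPLE = """\
07c3 127.0.0.1:30004@31004 slave e7d1 0 1426238317239 4 connected
67ed 127.0.0.1:30002@31002 master - 0 1426238316232 2 connected 5461-10922
292f 127.0.0.1:30003@31003 master - 0 1426238318243 3 connected 10923-16383
6ec2 127.0.0.1:30005@31005 slave 67ed 0 1426238316232 5 connected
824f 127.0.0.1:30006@31006 slave 292f 0 1426238317741 6 connected
e7d1 127.0.0.1:30001@31001 myself,master - 0 0 1 connected 0-5460
"""

if __name__ == "__main__":
    for addr, slots in masters_from_cluster_nodes(SAMPLE).items():
        print(addr, "->", s3_key(slots, "20240101T000000Z"))
```

At restore time you pick the object whose key matches the slot range of the shard you are rebuilding, regardless of which host was the master when the backup was taken.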
Note that the RDB file is a point-in-time snapshot of the data. Any writes the source Redis server accepts after the snapshot was taken will not be in the copied RDB file.
To minimize what the backup misses, and to keep the snapshots of the different shards close to each other in time, it's best to take the backup when the cluster is relatively quiescent (i.e. not under heavy write activity).
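If you would rather not rely on whatever snapshot happens to be on disk, you can trigger a fresh one right before copying the file. The sketch below assumes a client object that exposes `bgsave()` and `lastsave()` the way redis-py does; the function name, timeout, and polling interval are illustrative choices.

```python
import time


def snapshot_and_wait(client, timeout=300.0, poll=1.0):
    """Trigger BGSAVE and block until LASTSAVE advances.

    `client` is assumed to offer bgsave() and lastsave(), as redis-py's
    Redis class does. Returns the new LASTSAVE value.
    """
    before = client.lastsave()
    client.bgsave()  # may raise if a background save is already running
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        latest = client.lastsave()
        # LASTSAVE has one-second resolution, so a save that finishes in
        # the same second as the previous one would not be detected here.
        if latest > before:
            return latest
        time.sleep(poll)
    raise TimeoutError("BGSAVE did not complete within the timeout")
```

With redis-py this could be called as `snapshot_and_wait(redis.Redis(host=..., port=...))` against each master in turn, after which the RDB file in that node's data directory is safe to copy.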