I had a working TiDB instance running in gcloud, deployed using the tidb-ansible scripts. I wanted to replace the PD nodes with new ones, so I destroyed and replaced those. The PD cluster comes up ok now, but when I try to start the TiKV nodes, I get this error:
2018/02/28 01:42:08.091 node.rs:191: [ERROR] cluster ID mismatch: local_id 6520261967047847245 remote_id 6527407705559138241. you are trying to connect to another cluster, please reconnect to the correct PD
There's a good explanation of the error in the TiDB FAQs (https://pingcap.com/docs/FAQ/):
-- The cluster ID mismatch message is displayed when starting TiKV. --
This is because the cluster ID stored in local TiKV is different from the cluster ID specified by PD. When a new PD cluster is deployed, PD generates random cluster IDs. TiKV gets the cluster ID from PD and stores the cluster ID locally when it is initialized. The next time when TiKV is started, it checks the local cluster ID with the cluster ID in PD. If the cluster IDs don’t match, the cluster ID mismatch message is displayed and TiKV exits.
If you previously deploy a PD cluster, but then you remove the PD data and deploy a new PD cluster, this error occurs because TiKV uses the old data to connect to the new PD cluster.
But there is no explanation of how to fix the problem. Is there a way to destroy the local cluster ID on the TiKV instance so it can hook up with the PD properly?
Will PD be able to coordinate my existing TiKV nodes (with existing data) if I can get them to talk again?
To hook up the TiKV instance with PD properly, you can change the PD’s cluster ID. See the following steps and example.
Yes, it will.
You can use "pd-recover" to fix this issue.
Step 1. Run the new pd-server with the cluster ID: 6527407705559138241.
Step 2. Change the cluster ID to 6520261967047847245.
Step 3. Restart the PD server.
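The two IDs in these steps come straight from the TiKV error message: local_id is what TiKV has stored and remote_id is what the new PD generated. As a small sketch (the variable names are only illustrative), they can be pulled out of the log line like this:

```shell
# The error line from the TiKV log, copied from the question.
LOG_LINE='2018/02/28 01:42:08.091 node.rs:191: [ERROR] cluster ID mismatch: local_id 6520261967047847245 remote_id 6527407705559138241. you are trying to connect to another cluster, please reconnect to the correct PD'

# local_id: the cluster ID stored on the TiKV node (the one PD must be changed to).
LOCAL_ID=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.*local_id \([0-9][0-9]*\).*/\1/p')

# remote_id: the cluster ID the new PD cluster generated for itself.
REMOTE_ID=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.*remote_id \([0-9][0-9]*\).*/\1/p')

echo "TiKV expects cluster ID: $LOCAL_ID"
echo "New PD currently has:    $REMOTE_ID"
```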
Note that PD has a MONOTONIC unique ID allocator, which is the alloc-id. All region IDs and peer IDs are generated by the allocator. So please make sure the alloc-id you choose in step 2 is larger than every ID that already exists, otherwise it will corrupt TiKV.
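A minimal sketch of step 2 with pd-recover, assuming its -endpoints, -cluster-id, and -alloc-id flags (check the tool's help output for your version). The PD address and the alloc-id value below are placeholders, not values from this cluster: point the endpoint at your new PD server and pick an alloc-id safely above every region and peer ID your TiKV nodes already hold.

```shell
# Assumption: address of the new PD server; replace with your own.
PD_ENDPOINT="http://127.0.0.1:2379"

# The cluster ID the existing TiKV nodes have stored (local_id in the error message).
CLUSTER_ID=6520261967047847245

# Assumption: placeholder value. It must be larger than any region ID or
# peer ID already allocated in the old cluster.
ALLOC_ID=100000000

# Dry run: print the command first. Remove the leading "echo" to execute it.
echo ./pd-recover -endpoints "$PD_ENDPOINT" -cluster-id "$CLUSTER_ID" -alloc-id "$ALLOC_ID"
```

After pd-recover succeeds, restart the PD server (step 3) and then start the TiKV nodes.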