How does Cassandra ensure consistency when adding a new node?


I am a little confused about how Cassandra ensures consistency when a new node is added to the cluster. I know Cassandra performs range movements and streams data to the newly added node. My question is: does Cassandra also stream the secondary replicas' data to the new node?

For example, we have 4 nodes (A, B, C, D) in the cluster with RF=3: A(x=1, y=2), B(x=1, y=3), C(x=1), D(y=2). Partition key "x" is held by A, B, C, while partition key "y" is held by D, A, B. Suppose I add a new node A' between A and B. I think it will stream partition "x" from A. But does it also stream partition "y" from B or D?
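To make the layout above concrete, here is a minimal Python sketch of ring placement. It assumes a simplified model: one token per node (no vnodes) and SimpleStrategy-style placement, where a key's replicas are the first RF nodes clockwise from the key's token. The token values and the `replicas` helper are hypothetical, picked only so that "x" lands on A, B, C and "y" on D, A, B as in the example. It shows which nodes hold each key, not which node Cassandra streams from.

```python
from bisect import bisect_left

def replicas(ring, key_token, rf):
    """Return the rf nodes found walking clockwise from key_token.

    ring maps node name -> token (hypothetical values, one token per node).
    """
    nodes = sorted(ring, key=ring.get)
    tokens = [ring[n] for n in nodes]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]

RF = 3
ring = {"A": 100, "B": 200, "C": 300, "D": 400}
key_tokens = {"x": 50, "y": 350}  # assumed tokens for the two partition keys

before = {k: replicas(ring, t, RF) for k, t in key_tokens.items()}

# Add the new node A' between A and B.
ring_after = dict(ring, **{"A'": 150})
after = {k: replicas(ring_after, t, RF) for k, t in key_tokens.items()}

print(before)  # x on A, B, C and y on D, A, B
print(after)   # x on A, A', B and y on D, A, A'
```

Under these assumptions A' becomes a replica for both "x" and "y", so it needs a copy of both partitions, while C stops being a replica for "x" and B stops being a replica for "y".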

If it does stream partition "y", which node will Cassandra choose to stream from? According to the official documentation, it will stream from the primary replica, which is D. If that's the case and D has stale data (which was fine before adding the new node, since both A and B had the latest data, which satisfies quorum), then after streaming it is possible for a query to return stale data from D and A'. Am I right?
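The concern can be checked with a small Python sketch. It is a toy model, not Cassandra's actual read path: each replica holds a hypothetical (value, timestamp) pair, a QUORUM read contacts any `RF // 2 + 1` replicas, and the newest timestamp wins. It also assumes that A' received D's stale copy and that B is no longer a replica for "y" after the range movement.

```python
from itertools import combinations

def quorum(rf):
    """Number of replicas a QUORUM operation must reach."""
    return rf // 2 + 1

def quorum_reads(replicas, rf):
    """Return every value that some quorum-sized group of replicas could return.

    replicas maps node name -> (value, timestamp); the newest timestamp wins.
    """
    results = set()
    for group in combinations(replicas, quorum(rf)):
        value, _ts = max((replicas[n] for n in group), key=lambda vt: vt[1])
        results.add(value)
    return results

RF = 3
# Before the new node joins: D is stale, A and B have the latest write.
before = {"D": ("old", 1), "A": ("new", 2), "B": ("new", 2)}
# After (assumed): A' streamed the stale copy from D and B dropped out.
after = {"D": ("old", 1), "A": ("new", 2), "A'": ("old", 1)}

print(quorum_reads(before, RF))  # only the latest value is reachable
print(quorum_reads(after, RF))   # the stale value is reachable via D and A'
```

In this model every quorum before the change includes A or B, so reads always see the latest value. Afterwards the pair D and A' forms a quorum that holds only stale data.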


There are 2 best solutions below

0
On

You are probably right. Running nodetool repair is recommended before adding a new node so that there is no inconsistency in the cluster.

4
On

Cassandra will stream data from the node that is giving up ownership of the token.

That is, in your example: RF=3 (A,B,C,D) A(x=1, y=2), B(x=1, y=3), C(x=1), D(y=2). If E is added between A and B, A will give up ownership of x to E and B will give up ownership of y. Then A will send its value of x to E and B will send its value of y to E, so the end result will be A(y=2), E(x=1, y=3), B(x=1), C(x=1), D(y=2).

Please note that after the node is added, A still has a stale copy of x and B still has a stale copy of y, so 'nodetool cleanup' should be run on them to get rid of that data.