I would like to remove a node from my Cassandra cluster and am following these two related questions (here and here) and the Cassandra documentation, but I am still not quite sure of the exact process.
My first question is: is the following the correct way to remove a node from a Cassandra cluster?
1. decommission the node that I would like to remove.
2. removetoken the node that I just decommissioned.
If the above process is right, how can I tell that the decommission process has completed so that I can proceed to the second step? Or is it always safe to do step 2 right after step 1?
In addition, the Cassandra documentation says:
You can take a node out of the cluster with nodetool decommission to a live node, or nodetool removetoken (to any other machine) to remove a dead one. This will assign the ranges the old node was responsible for to other nodes, and replicate the appropriate data there. If decommission is used, the data will stream from the decommissioned node. If removetoken is used, the data will stream from the remaining replicas.
No data is removed automatically from the node being decommissioned, so if you want to put the node back into service at a different token on the ring, it should be removed manually.
Does this mean a decommissioned node is a dead node? Also, since no data is removed automatically from the node being decommissioned, how can I tell when it is safe to remove the data from the decommissioned node (i.e., how do I know when the data streaming has completed)?
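Removing a node from a Cassandra cluster (in Cassandra v1.2.8) comes down to running nodetool decommission on the node to be removed. Per the docs, nodetool decommission is the command for taking a live node out of the cluster, while nodetool removetoken is only for removing a dead one.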
Update: The above process also works for seed nodes. In that case, the cluster keeps running smoothly without requiring a restart. When you need to restart the cluster for other reasons, be sure to update the seeds parameter specified in cassandra.yaml on all nodes.
Decommission the target node
When decommission starts, the decommissioned node is first labeled as leaving (marked as L). In this example, the node being removed is node-76.
While a decommissioned node is marked as leaving, it is streaming data to the other living nodes. Once the streaming is completed, the node no longer appears in the ring, and the data owned by the other nodes increases.
Removing the remaining data manually
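Before deleting anything, it is worth confirming that the node has actually left the ring. The following is a minimal sketch, not a definitive tool: it assumes nodetool is on the PATH, that nodetool status prints one row per node with the two-letter status/state code first and the address second, and that the function names, the address, and the live host passed in are hypothetical placeholders.

```python
import subprocess
import time


def node_state(status_output, address):
    """Return the two-letter status/state code (e.g. 'UN', 'UL') printed
    for `address`, or None if the address is not listed in the ring."""
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with the code, followed by the node address.
        if len(fields) >= 2 and fields[1] == address:
            return fields[0]
    return None


def wait_until_removed(address, live_host, poll_seconds=30):
    """Poll `nodetool status` on another live node until `address`
    no longer appears in its view of the ring."""
    while True:
        out = subprocess.check_output(
            ["nodetool", "-h", live_host, "status"], text=True
        )
        state = node_state(out, address)
        if state is None:
            return
        print(f"{address} is still listed as {state}; waiting...")
        time.sleep(poll_seconds)
```

A code ending in L means the node is still leaving, so its data is still streaming. Polling a different live node is a deliberate choice here, since that is where the ring view described above is observed.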
Once the streaming is completed, the data stored on the decommissioned node can be removed manually, as described in the Cassandra documentation. This can be done by removing the data stored in the data_file_directories, commitlog_directory, and saved_caches_directory locations specified in the cassandra.yaml file on the decommissioned node.
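The cleanup can be sketched as a small helper that empties those three locations. This is only an illustration: the function name is hypothetical, the directory paths must be copied by hand from your own cassandra.yaml (nothing here reads that file), and it defaults to a dry run so that nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def clear_directories(directories, dry_run=True):
    """Remove everything inside each directory, keeping the directory itself.
    Returns the entries that were (or, in a dry run, would be) removed."""
    removed = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip paths that do not exist on this node
        for entry in sorted(root.iterdir()):
            removed.append(entry)
            if dry_run:
                continue
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return removed
```

A cautious way to use it is to call it once with the default dry_run=True, review the returned list, and only then call it again with dry_run=False, after the Cassandra process on the decommissioned node has been stopped.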