Cassandra nodetool has a command called cleanup:
cleanup [keyspace][cf_name]
Triggers the immediate cleanup of keys no longer belonging to this node. This has roughly the same effect on a node that a major compaction does in terms of a temporary increase in disk space usage and an increase in disk I/O. Optionally takes a list of column family names.
My questions are:
- When will a node have keys that do not belong to it?
- When should I issue a cleanup?
- Should I do cleanup regularly (e.g. once per week)?
A node ends up holding keys that no longer belong to it when you have added new nodes to the cluster, decreased the replication factor, or moved tokens.
Issue a cleanup after one of the above operations, if you need to reclaim disk space. There is no harm in delaying it: cleanup has a performance impact, and the only reason to run it is to reclaim disk space.
No, there is no need to run it regularly. Run it only if you need to reclaim space after one of the above operations.
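To illustrate why a node can end up with keys it no longer owns, here is a rough Python sketch of a token ring. It is a toy model, not Cassandra's implementation: it assumes a single replica per key, derives tokens from MD5 rather than Cassandra's partitioner, and the names (`owner`, `add_node`, `stale_keys`, `cleanup`, `node_a`, and so on) are all made up for the example.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # toy token space, much smaller than Cassandra's


def token_for(key):
    """Map a key to a token in [0, RING_SIZE) (MD5 is an assumption here)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(key, ring):
    """Return the node owning `key`; `ring` maps node name -> token.

    A node owns the range (previous token, its own token], wrapping around.
    """
    ordered = sorted((tok, node) for node, tok in ring.items())
    positions = [tok for tok, _ in ordered]
    idx = bisect.bisect_left(positions, token_for(key))
    return ordered[idx % len(ordered)][1]


def add_node(stored, ring, name, token):
    """Add a node: it receives copies of the keys it now owns.

    The previous owners keep their copies on disk, which is the point.
    """
    new_ring = dict(ring)
    new_ring[name] = token
    all_keys = set().union(*stored.values()) if stored else set()
    stored[name] = {k for k in all_keys if owner(k, new_ring) == name}
    return new_ring


def stale_keys(stored, ring):
    """Per node, the keys held on disk that the node no longer owns."""
    return {
        node: {k for k in keys if owner(k, ring) != node}
        for node, keys in stored.items()
    }


def cleanup(stored, ring, node):
    """Toy equivalent of cleanup: drop keys `node` no longer owns."""
    stored[node] = {k for k in stored[node] if owner(k, ring) == node}


def build_cluster():
    """Two-node cluster holding 100 hypothetical keys."""
    ring = {"node_a": RING_SIZE // 2 - 1, "node_b": RING_SIZE - 1}
    stored = {"node_a": set(), "node_b": set()}
    for i in range(100):
        key = "user:%d" % i
        stored[owner(key, ring)].add(key)
    return stored, ring


if __name__ == "__main__":
    stored, ring = build_cluster()
    ring = add_node(stored, ring, "node_c", RING_SIZE // 4 - 1)
    before = {n: len(k) for n, k in stale_keys(stored, ring).items()}
    print("stale keys per node after adding node_c:", before)
    cleanup(stored, ring, "node_a")
    after = {n: len(k) for n, k in stale_keys(stored, ring).items()}
    print("stale keys per node after cleanup on node_a:", after)
```

In this model, only the node whose range was split (`node_a`) holds stale keys after `node_c` joins, and they simply take up space until the cleanup step removes them. That matches the answer above: the leftover data is harmless apart from the disk it uses.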