Disk usage remains the same even after Cassandra TTL (time to live) expires


I have implemented the TTL feature in my code and it is removing the records successfully, but the on-disk storage size remains the same. I expected it to shrink immediately. The latest Cassandra documentation says that the data will be removed after the first compaction, but it isn't clear about when that happens.

My simple question:

I need the disk space to be freed as soon as the TTL expires. How can I check whether the disk space has actually been reduced?

Please share your ideas.

Thanks


There are 2 best solutions below


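Cassandra persists its data in SSTables. AFAIK, SSTables are immutable: normal writes only ever add new SSTables, and existing ones are never updated or deleted in place. Deletes and expired TTLs are recorded as tombstones rather than being removed from disk. It is compaction, as you mentioned yourself, that merges SSTables. This can free disk space, because tombstones (deleted columns) and the data they shadow can be dropped during the merge.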
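To make that concrete, here is a toy model in Python. It is only a sketch and not how Cassandra is implemented: the dict-based "SSTables", the `compact` and `entry_count` functions, and the simplified rule "drop a tombstone once `gc_grace_seconds` has passed" are all assumptions made for illustration.

```python
# Toy model only: real SSTables are immutable on-disk files, not dicts, and
# real compaction strategies are far more involved.
# Each "sstable" maps key -> (write_timestamp, value, deleted_at);
# value is None for a tombstone, and deleted_at is when it was created.

def compact(sstables, now, gc_grace_seconds):
    """Merge all sstables into one, keeping the newest entry per key and
    dropping tombstones that are older than gc_grace_seconds."""
    merged = {}
    for sstable in sstables:
        for key, entry in sstable.items():
            # The entry with the highest write timestamp wins.
            if key not in merged or entry[0] > merged[key][0]:
                merged[key] = entry
    return {
        key: (ts, value, deleted_at)
        for key, (ts, value, deleted_at) in merged.items()
        if value is not None or deleted_at + gc_grace_seconds > now
    }


def entry_count(sstables):
    """Stand-in for disk usage: total entries stored across all sstables."""
    return sum(len(sstable) for sstable in sstables)


# Two live rows are flushed, then row "a" expires and its tombstone lands in
# a *new* sstable. The old data is still there, so nothing is reclaimed yet.
sstables = [
    {"a": (1, "x", None), "b": (1, "y", None)},
    {"a": (2, None, 100)},
]
print(entry_count(sstables))  # 3

# Only the merge drops the expired row and its tombstone.
compacted = compact(sstables, now=200, gc_grace_seconds=10)
print(compacted)  # {'b': (1, 'y', None)}
```

Note that in this model the expired row takes up more space after it expires (old value plus tombstone) until the compaction runs, which matches what you are seeing on disk.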

As can be read here, compaction is a background process. You can influence how often it runs through configuration, as described here. I don't think there is a way to force a minor compaction from a Cassandra client. You can force a major compaction with nodetool compact, but keep in mind that compaction is quite I/O intensive, so I don't think that's a real solution.
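If you do want to trigger it from a script (for example in a test environment), a minimal sketch could look like the following. It assumes `nodetool` is on the `PATH` of the node you run it on; the helper names `build_compact_command` and `run_major_compaction` are made up for this example.

```python
import subprocess


def build_compact_command(keyspace=None, table=None):
    """Return the argv list for `nodetool compact`, optionally limited to
    one keyspace or one table (hypothetical helper)."""
    if table and not keyspace:
        raise ValueError("a table can only be given together with a keyspace")
    command = ["nodetool", "compact"]
    if keyspace:
        command.append(keyspace)
    if table:
        command.append(table)
    return command


def run_major_compaction(keyspace=None, table=None):
    """Run the compaction on the local node; raises CalledProcessError if
    nodetool exits with a non-zero status."""
    subprocess.run(build_compact_command(keyspace, table), check=True)
```

Calling `run_major_compaction("my_keyspace", "my_table")` (placeholder names) would run `nodetool compact my_keyspace my_table`. Restricting it to a single table keeps the I/O cost lower than compacting everything on the node.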

But why is it so important that the removed data is immediately physically removed?

0
On

Running nodetool repair -pr on each node makes sure your tombstones reach every replica, so that compaction can safely drop the tombstoned records once gc_grace_seconds has passed (and you should be running repair at least that often anyway, see http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html?scroll=concept_ds_ebj_d3q_gk).

To track the disk space, this could be as simple as running "df -h" before and after.
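If you would rather check this from code than by eyeballing df output, here is a rough Python sketch using only the standard library. The function names are made up for this example, and the data directory path in the usage note is an assumption; use whatever your data_file_directories setting points to.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all files below path (e.g. a table's data directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # SSTable files can disappear mid-walk when a compaction finishes.
            try:
                total += os.path.getsize(file_path)
            except FileNotFoundError:
                pass
    return total


def free_bytes(path):
    """Free space on the filesystem containing path, similar to what df reports."""
    return shutil.disk_usage(path).free
```

For example, record `directory_size_bytes("/var/lib/cassandra/data")` (assumed path) before the TTL expires and again after a compaction has run, then compare the two numbers.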