When using 2P CRDT data structures (for example 2P-set), how do you free up space?

368 Views Asked by At

2P-set allows to remove the elements from a set, but doesn't allow to free the space those removed elements take up. In fact, removal of an element consumes space, rather that frees it. What's the algorithm to free up space for 2P structures?

I'm trying to understand for what problems can I use CRDT structures in practice. Without a way to free up space, the 2P CRDT structures seem to have a very limited use for the real world tasks.

1

There are 1 best solutions below

3
On

While I cannot speak for 2P-Set - since I still haven't figured out a practical use case for it. However usually we can apply few techniques:

  1. Compaction of metadata used by CRDT: a lot of CRDTs where initially implemented with very simple design, and later on optimized to meet industry standards. Example of such can be OR-Set reimplemented on top of dotted vector versions. In this implementation you don't need to keep removed elements in memory: instead we can track added/removed elements using dots that could eventually be compressed into vector clocks. Here I described this problem in more detail.
  2. Prunning can be useful, once some of the replicas are no longer needed, eg. because we reduced a number of nodes or these nodes are no longer available. In such case, we can merge the payload with metadata as if it was produced by another replica. Example: given G-Counter represented with map {A:1,B:2,C:1} and a dead node B (which can no longer increment its state), we could prune it by merging B's entry into shape {A:3,C:1}, therefore reducing its size while still preserving the correct value. The problem is that prunning algorithm must guarantee, that all replicas must converge into this decision independently.