Hi all (Cassandra noob here),
I'm trying to understand exactly what's going on with repair, so that we can get to the point where we run our repairs on a schedule.
I have set up a 4-DC (3 nodes per DC) Cassandra 2.1 cluster: 4 GB RAM (1 GB heap), HDD.
Due to various issues (repairs taking too long or crashing nodes with OOM), I decided to start fresh and nuke everything: I deleted /var/lib/cassandra/data and /opt/cassandra/data.
I then recreated my keyspaces (no data) and ran nodetool repair -par -inc -local.
I was surprised to see it took ~5 min to run; watching the logs, I could see Merkle trees being generated.
I guess my question is: if my keyspaces have no data, what is it generating Merkle trees against?
After running a local-DC repair against each node in that DC, I decided to run a cross-DC repair, again with no data.
This time it took 4+ hours. With no data? This really feels wrong to me. What am I missing?
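
For reference, the repair sequence described above can be sketched roughly as below. This is only an illustration, not the exact procedure used: the hostnames are placeholders, running nodetool remotely with -h is an assumption (the same command can be run locally on each node instead), and the cross-DC run is assumed to be the same command without -local. By default it only prints the commands.

```python
import shlex
import subprocess

# Hypothetical hostnames; substitute the real nodes of one DC.
LOCAL_DC_NODES = ["dc1-node1", "dc1-node2", "dc1-node3"]


def repair_command(host, local_only=True):
    """Build the nodetool invocation for one node.

    -par, -inc and -local are the flags quoted in this post; dropping
    -local is assumed here to give the cross-DC variant.
    """
    cmd = ["nodetool", "-h", host, "repair", "-par", "-inc"]
    if local_only:
        cmd.append("-local")
    return cmd


def run_repairs(hosts, local_only=True, dry_run=True):
    """Repair one node at a time; with dry_run only print the commands."""
    commands = [repair_command(h, local_only) for h in hosts]
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # Raises CalledProcessError if nodetool exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_repairs(LOCAL_DC_NODES)
```

Calling run_repairs(LOCAL_DC_NODES, dry_run=False) would actually invoke nodetool, one node after another, which is the shape a scheduled job would eventually need.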
Thanks