The repair of Cassandra data


I have a playground Cassandra cluster: 7 nodes (v2.2.4) on server hardware, with no network problems. The replication factor (RF) is 3. To load data, I started a script that generates test data. One table has about 2 billion records.

I ran a subrange repair while the script was running. As a result, the repair of some segments failed. I then started upgradesstables, which also finished with errors, so I ran sstablescrub. After sstablescrub, the repair failed on some segments again.

What could be causing the repair problems in my case?

Should sstablescrub be run on every node of the cluster?

Here is the script for the subrange repair. I hope it may be useful to somebody.

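# Collect the ring tokens (only tokens printed with 19 digits are matched).
ring=( $("$vCSBIN"/nodetool ring | grep -oE '[-]?[0-9]{19}') )

# Repair each range between consecutive tokens. The wrap-around range
# (last token back to the first token) is not covered by this loop.
for ((i = 0; i < ${#ring[@]} - 1; i++)); do
    echo "st = ${ring[i]}, et = ${ring[i+1]}"
    "$vCSBIN"/nodetool repair -st "${ring[i]}" -et "${ring[i+1]}"
done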
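
A possible variant of the loop above, as a minimal sketch rather than a tested tool: it keeps the same consecutive-token pairing but records every range whose repair command fails, so that only those ranges need to be retried. The names `token_pairs`, `repair_subranges` and the `NODETOOL` variable are hypothetical, and the sketch assumes that `nodetool repair` exits with a non-zero status when a repair fails, which I have not verified for 2.2.4.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read tokens from stdin (one per line) and print
# "start end" for each pair of consecutive tokens. Like the original
# loop, it does not emit the wrap-around range.
token_pairs() {
    local prev="" tok
    while read -r tok; do
        if [ -n "$prev" ]; then
            printf '%s %s\n' "$prev" "$tok"
        fi
        prev=$tok
    done
}

# Hypothetical helper: read "start end" pairs from stdin, repair each one,
# and append the pairs that failed to the file given as $1.
# ASSUMPTION: the repair command returns non-zero on failure.
repair_subranges() {
    local failed_file=$1 st et
    while read -r st et; do
        echo "st = $st, et = $et"
        # </dev/null keeps the command from consuming the pair list on stdin.
        if ! "${NODETOOL:-nodetool}" repair -st "$st" -et "$et" </dev/null; then
            printf '%s %s\n' "$st" "$et" >> "$failed_file"
        fi
    done
}
```

It could be used roughly like this, with `failed_ranges.txt` being an arbitrary file name:

NODETOOL="$vCSBIN/nodetool"
"$NODETOOL" ring | grep -oE '[-]?[0-9]{19}' | token_pairs | repair_subranges failed_ranges.txt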

The fragment of system.log:

INFO  [Thread-59449] 2016-02-22 13:14:08,916 RepairSession.java:237 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] new session: will sync cassandra1111.mydomain.com/10.0.0.0.77, /10.0.0.0.85, /10.0.0.0.192 on range (-4991002611964638502,-4985574971950992136] for ks1.[t1, counters]
INFO  [Repair#1997:1] 2016-02-22 13:14:08,918 RepairJob.java:107 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] requesting merkle trees for t1 (to [/10.0.0.0.85, /10.0.0.0.192, cassandra1111.mydomain.com/10.0.0.0.77])
INFO  [Repair#1997:1] 2016-02-22 13:14:08,918 RepairJob.java:181 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Requesting merkle trees for t1 (to [/10.0.0.0.85, /10.0.0.0.192, cassandra1111.mydomain.com/10.0.0.0.77])
ERROR [ValidationExecutor:7] 2016-02-22 13:14:08,919 Validator.java:246 - Failed creating a merkle tree for [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]], /10.0.0.0.77 (see log for details)
INFO  [AntiEntropyStage:1] 2016-02-22 13:14:08,920 RepairSession.java:181 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Received merkle tree for t1 from /10.0.0.0.77
WARN  [RepairJobTask:1] 2016-02-22 13:14:08,920 RepairJob.java:162 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] t1 sync failed
INFO  [Repair#1997:2] 2016-02-22 13:14:08,920 RepairJob.java:107 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] requesting merkle trees for counters (to [/10.0.0.0.85, /10.0.0.0.192, cassandra1111.mydomain.com/10.0.0.0.77])
org.apache.cassandra.exceptions.RepairException: [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
INFO  [Repair#1997:2] 2016-02-22 13:14:08,920 RepairJob.java:181 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Requesting merkle trees for counters (to [/10.0.0.0.85, /10.0.0.0.192, cassandra1111.mydomain.com/10.0.0.0.77])
com.google.common.util.concurrent.UncheckedExecutionException: org.apache.cassandra.exceptions.RepairException: [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
Caused by: org.apache.cassandra.exceptions.RepairException: [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
INFO  [AntiEntropyStage:1] 2016-02-22 13:14:08,920 RepairSession.java:181 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Received merkle tree for t1 from /10.0.0.0.85
ERROR [RepairJobTask:1] 2016-02-22 13:14:08,921 RepairSession.java:290 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Session completed with the following error
org.apache.cassandra.exceptions.RepairException: [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
INFO  [AntiEntropyStage:1] 2016-02-22 13:14:08,921 RepairSession.java:181 - [repair #110b9d40-d990-11e5-a89b-41ca3fbac573] Received merkle tree for t1 from /10.0.0.0.192
ERROR [RepairJobTask:1] 2016-02-22 13:14:08,921 RepairRunnable.java:243 - Repair session 110b9d40-d990-11e5-a89b-41ca3fbac573 for range (-4991002611964638502,-4985574971950992136] failed with error [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
org.apache.cassandra.exceptions.RepairException: [repair #110b9d40-d990-11e5-a89b-41ca3fbac573 on ks1/t1, (-4991002611964638502,-4985574971950992136]] Validation failed in cassandra1111.mydomain.com/10.0.0.0.77
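
To see which segments failed, the ranges can be pulled out of the log. The following is a minimal sketch that relies only on the `RepairRunnable` message format shown in the fragment above ("Repair session ... for range (start,end] failed"); the function name `failed_ranges` is hypothetical, and other Cassandra versions may word the message differently.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print "start end" once for every token range whose
# repair session failed. Reads the files given as arguments, or stdin if
# none are given. ASSUMPTION: the log uses the message format seen above.
failed_ranges() {
    sed -nE 's/.*Repair session [0-9a-f-]+ for range \((-?[0-9]+),(-?[0-9]+)] failed.*/\1 \2/p' "$@" |
        sort -u
}
```

For example, `failed_ranges system.log` prints one "start end" pair per failed range, which can then be passed back to `nodetool repair -st ... -et ...`.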