I have an issue with range selects in Cassandra: they sometimes don't return all the data. The cluster runs Cassandra 2.1.0, using the binaries available from Apache.
This is my table:
CREATE TABLE metrics.main_cnt (
gran ascii,
ctx ascii,
io ascii,
eid uuid,
dt bigint,
apdex_s counter,
apdex_t counter,
"count" counter,
error counter,
time counter,
PRIMARY KEY ((gran, ctx, io, eid), dt));
I have many rows in that table, and if I execute this query:
SELECT * from main_cnt WHERE gran = 'min' AND ctx ='A' AND io = 'i' AND eid =4379eec6-ba09-4f70-8862-1c864595c371 and dt in (1420644000000, 1420640400000);
I get this result:
gran | ctx | io | eid | dt | apdex_s | apdex_t | count | error | time
------+-----+----+--------------------------------------+---------------+---------+---------+-------+-------+--------
min | A | i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420640400000 | 671 | 4 | 677 | 0 | 168253
min | A | i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420644000000 | 554 | 10 | 566 | 0 | 192666
But if I use a range select like this:
SELECT * from main_cnt WHERE gran = 'min' AND ctx ='A' AND io = 'i' AND eid =4379eec6-ba09-4f70-8862-1c864595c371 and dt >= 1420640400000 and dt <= 1420644000000;
I only get one row:
gran | ctx | io | eid | dt | apdex_s | apdex_t | count | error | time
------+-----+----+--------------------------------------+---------------+---------+---------+-------+-------+--------
min | A | i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420640400000 | 671 | 4 | 677 | 0 | 168253
I also tried widening the range, but the result was no better. This is not the only case, but if I change the dt bounds, I sometimes get a correct result with several rows.
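To pin down exactly which rows a range select drops, the dt values returned by the IN query can be compared with those returned by the range query. The following is only a minimal sketch using plain Java collections: the class name RangeCheck and the method missingFromRange are hypothetical, and the dt lists are typed in by hand from the two results above rather than fetched through the driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Cassandra or the Java driver.
class RangeCheck {

    // Returns the dt values that lie inside [from, to] (both inclusive, like
    // dt >= from AND dt <= to) but are absent from the range query result.
    static List<Long> missingFromRange(List<Long> expectedDts, List<Long> rangeResult,
                                       long from, long to) {
        TreeSet<Long> seen = new TreeSet<>(rangeResult);
        List<Long> missing = new ArrayList<>();
        for (long dt : expectedDts) {
            if (dt >= from && dt <= to && !seen.contains(dt)) {
                missing.add(dt);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // dt values returned by the IN query
        List<Long> inResult = Arrays.asList(1420644000000L, 1420640400000L);
        // dt values returned by the range query
        List<Long> rangeResult = Arrays.asList(1420640400000L);

        // prints [1420644000000]
        System.out.println(missingFromRange(inResult, rangeResult,
                1420640400000L, 1420644000000L));
    }
}
```

With the two results shown above, the only value reported as missing is the upper bound of the range, 1420644000000.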
A nodetool repair doesn't fix the problem.
I didn't find any ticket in Jira about such an issue. Does anyone know about this issue? Thanks for any help.
Edit: more information:
replication factor = 3
cluster has 8 or 9 nodes most of the time
increments are done with the Java driver 2.1.5 and prepared statements, using this command: UPDATE main_cnt SET time = time + ?, \"count\" = \"count\" + ?, error = error + ?, apdex_s = apdex_s + ?, apdex_t = apdex_t + ? WHERE gran = ? AND dt = ? AND ctx = ? AND eid = ? AND io = ?
Trace for the normal select: trace1.log
Trace for the incorrect range select: trace2.log
I have no idea why, but the issue was fixed after upgrading the cluster to Cassandra 2.1.8.