I am facing some issues with the Spark Cassandra connector filtering for Java. Cassandra allows filtering by the last column of the partition key with an IN clause, e.g.:

```
create table cf_text
(a varchar, b varchar, c varchar, primary key((a,b),c));
```

Query:

```
select * from cf_text where a = 'asdf' and b in ('af','sd');
```

In Spark:

```
sc.cassandraTable("test", "cf_text").where("a = ?", "af").toArray.foreach(println)
```

How can I specify in Spark the IN clause that is used in the CQL query? And how can range queries be specified as well?
Just wondering, but does your Spark code above work? I thought that Spark won't allow a `WHERE` on partition keys (`a` and `b` in your case), since it uses them under the hood (see the last answer to this question: Spark Datastax Java API Select statements).

In any case, with the Cassandra Spark connector you are allowed to stack your `WHERE` clauses, and an `IN` can be specified with a `List<String>`. Note that the normal rules of using `IN` with Cassandra/CQL still apply here.
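As a sketch (not tested against a live cluster; it assumes Cassandra is reachable at the configured host and that the `test.cf_text` table above exists), the connector's Java API lets you pass a `List` as the single bound value of an `IN` and stack `where()` calls:

```java
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraJavaRDD;

public class InClauseExample {
    public static void main(String[] args) {
        // Assumes spark.cassandra.connection.host points at a running cluster
        SparkConf conf = new SparkConf()
                .setAppName("in-clause-example")
                .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // The IN values are passed as a List bound to a single '?'
        List<String> values = Arrays.asList("af", "sd");

        // The predicate in where() is pushed down to Cassandra
        CassandraJavaRDD<CassandraRow> rows = javaFunctions(sc)
                .cassandraTable("test", "cf_text")
                .where("c in ?", values);

        rows.collect().forEach(System.out::println);
        sc.stop();
    }
}
```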
Range queries function in a similar manner:
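For example, a sketch under the same assumptions (a running cluster, a `JavaSparkContext` created elsewhere, and the table above), restricting the clustering column `c` to a range:

```java
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import org.apache.spark.api.java.JavaSparkContext;

import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraJavaRDD;

public class RangeQueryExample {
    // Assumes a JavaSparkContext sc configured for your cluster
    static CassandraJavaRDD<CassandraRow> rangeOnC(JavaSparkContext sc) {
        // Range predicates on a clustering column are pushed down to Cassandra;
        // the bound values "aaa" and "zzz" here are placeholders
        return javaFunctions(sc)
                .cassandraTable("test", "cf_text")
                .where("c > ? and c < ?", "aaa", "zzz");
    }
}
```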