I want to query a cassandra table using the spark-cassandra-connector using the following statements:
sc.cassandraTable("citizens","records")
.select("identifier","name")
.where( "name='Alice' or name='Bob' ")
And I get this error message:
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 81.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 81.0 (TID 9199, mydomain):
java.io.IOException: Exception during preparation of
SELECT "identifier", "name" FROM "citizens"."records" WHERE token("id") > ? AND token("id") <= ? AND name='Alice' or name='Bob' LIMIT 10 ALLOW FILTERING:
line 1:127 missing EOF at 'or' (...<= ? AND name='Alice' [or] name...)
What am I doing wrong here and how can I make an or
query using the where
clause of the connector?
Your
OR
clause is not valid CQL. For this few key values (I'm assumingname
is a key) you can use anIN
clause.The
where
clause is used for pushing downCQL
to Cassandra so only validCQL
can go inside it. If you are looking to do a Spark Side Sql-Like syntax check out SparkSql and Datasets.