Can I filter data across different columns in CQL with sorted order?

24 Views Asked by At

If I have the below data set:

id  year location
------------------
1   2000  Tokyo
1   2001  NewYork
2   2000  London
2   2002  NewYork

and a CQL table that has,

PRIMARY KEY is (id,year);

I would like create a materialized view to have each id for those records with location as NewYork.

The resulting view will have to look like:

id  year location
------------------
1   2001  NewYork
2   2002  NewYork

Basically, I need to sort by year first then with location next. Is it possible? Thanks.

1

There are 1 best solutions below

0
Nadav Har'El On

To your final question, "I need to sort by year first then with location next. Is it possible?" the answer is sort of, but not quite:

A materialized view may indeed have a different sorting order (clustering key) than the base table. But as it always happens in Scylla or Cassandra, this sorting is only between rows of a single partition - and the order between different partitions is unspecified. So if "id" remains the view's partition key, you won't have a global ordering by year - you'll have only the items in a single id ordered by year. If this is not what you wanted, you can achieve global ordering of all the items in the view by putting all the items in the same partition - but the result will be very inefficient if you have a lot of items because a single partition will only be hosted on one shard (node in Cassandra, CPU in Scylla), or rather RF of them.

It seems you are also asking whether it is possible to filter into the view only items whose location matches "New York". The answer is that it is only possible if you make "location" one of the key columns in the view (for an explanation, see https://github.com/scylladb/scylladb/issues/4250). I think this is ok for you, because you asked for the location to be used for sorting - so it will be a new clustering key column in the view.