Is it possible to read selectively from an Apache Cassandra row? I heard that "Typically an entire row is read behind the scenes every time a read query is fired". Is it possible to reduce pressure on the database engine by reading only selected columns? By reducing pressure I don't mean the usual advice to avoid select *, which translates into less IO and faster data movement across the network; I mean: internally, does the DB engine pull the entire row into memory before serving the results? I'm conscious of wide rows and would like my reads to have a very small footprint.
I get how to avoid full writes by selectively updating/writing only the column(s) you care about. This question is specifically about reads.
Yes, you can page through the rows if you filter only on the partition key, not on the clustering column(s).
For example, a table of video comments like this:
For a video which has 100K comments, Cassandra will retrieve the latest N comments on the first pass because the driver has paging enabled by default (5000 rows per page). But to page through the rest of the rows, the partition needs to be serialised on heap to iterate over the rows until you get the required subset.
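To make the paging behaviour a bit more concrete, here is a rough sketch in plain Python. It does not talk to Cassandra and does not model its internals; it only imitates how a client sees one large partition arrive in fixed-size pages. The `fetch_pages` function, the `page_size` parameter and the `(posted_at, text)` row shape are all made up for illustration, assuming comments are stored newest first within a video's partition.

```python
from typing import Iterator, List, Tuple

# Hypothetical row shape for one comment: (posted_at, text).
Comment = Tuple[int, str]


def fetch_pages(partition: List[Comment], page_size: int = 5000) -> Iterator[List[Comment]]:
    """Yield successive pages of at most page_size rows from one partition."""
    for start in range(0, len(partition), page_size):
        yield partition[start:start + page_size]


# Simulated partition: 100K comments for a single video, newest first.
partition = [(ts, f"comment {ts}") for ts in range(100_000, 0, -1)]

pages = fetch_pages(partition, page_size=5000)
first_page = next(pages)                  # the latest 5000 comments
remaining_pages = sum(1 for _ in pages)   # pages still needed to read the rest
```

The first page is cheap, but reading everything still means walking the whole partition page by page, which is why the size of the partition matters more than the number of columns you select.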
If you want to avoid having to load large partitions, you will need to model your data so that your partitions don't grow really large (wide). Cheers!
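One common way to keep partitions from growing without bound is to add a time bucket to the partition key, so that each video's comments are split across several smaller partitions. The sketch below is only an illustration of that idea: the `partition_key` helper, the monthly bucket size and the `video-42` id are assumptions, and the right bucket size depends on your write volume.

```python
from datetime import datetime, timezone
from typing import Tuple


def partition_key(video_id: str, posted_at: datetime) -> Tuple[str, str]:
    """Return a hypothetical composite (video_id, month_bucket) partition key."""
    # Normalise to UTC so the same instant always lands in the same bucket.
    bucket = posted_at.astimezone(timezone.utc).strftime("%Y-%m")
    return (video_id, bucket)


# Comments from different months end up in different partitions.
key_jan = partition_key("video-42", datetime(2024, 1, 15, tzinfo=timezone.utc))
key_feb = partition_key("video-42", datetime(2024, 2, 3, tzinfo=timezone.utc))
```

The trade-off is that reading a video's full comment history now takes one query per bucket, but each query touches a partition of bounded size.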