This is the first time I am working with Cassandra and I have a data structure like following, want to save it inside the Cassandra:
{"user_id": "123",
"user_cards": {
"first_card": {
"card_number": 456
}
}
}
I searched through the internet and found an example like following:
use json;
CREATE type json.sale ( id int, item text, amount int );
CREATE TABLE json.customers ( id int PRIMARY KEY, name text, balance int, sales list> );
INSERT INTO json.customers (id, name, balance, sales)
VALUES (123, 'Greenville Hardware', 700,
[{ id: 5544, item : 'tape', amount : 100},
{ id: 5545, item : 'wire', amount : 200}]) ;
But I am not sure if this is the best way of doing that? I remember from my experience with MySQL or MongoDB/Mongoose, we were defining separate tables for nested JSON objects and put the Foreign key of that table inside a column of the parent table(Or do population in Mongo).
As I heard Cassandra is against normalization and it's better to do denormalization on it, I don't know what approach should I take?
Also I have another question about data compression, Will cassandra take care of it if I do something like the example I put above?
On the surface, I think this is a good approach. My only concern would be if each customer has many sales...like millions. But if the business use isn't going to yield much more than a few thousand or so, this is probably ok.
If not, it might make sense to add a date/time component to the partition key like year, or something:
That'll make sure that sales will be capped per partition by year.
The other thing to consider, is the query pattern this needs to support. Right now, it only supports querying by
id. But if that's ok, then you should be good-to-go!