Understanding and Defining Primary Keys in Cassandra


In Cassandra, let's say I have a table called cats.

In the table cats, let's say that a cat can only be uniquely identified by the combination of its color and its name, for instance:

color: blue, name: henry

There could however be many cats that are blue in color. Similarly there could be many cats named Henry. However there is only one cat who is named Henry whose color is blue.

My question is: what should I define as the primary key, and how? For instance, should I make the name the partition key, should I make the color the partition key, or should I make both fields part of the partition key? Would it be beneficial to add the color and/or the name as a clustering key? I have also read about hashed values as the partition key; would having a separate hashed value as the partition key and adding the name and color as secondary indexes offer any benefit here?

What are the performance impacts here? What type of table setup would be the most performant?

Users will search either by name and color together or by color alone, but never by name alone.

Thanks in advance.

Best answer

To retrieve data from a table, you have to supply values for all columns defined in the partition key. Since in your case users can query either by color alone or by both color and name, you would need the following definition:

PRIMARY KEY (color, name);

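With this definition, color is the partition key and name is a clustering column. You can query by color alone, because that supplies the full partition key, and you can also query by color and name together to get a single cat.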
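As a minimal sketch of how this could look in practice, the CQL below creates the table and runs the two supported queries. It assumes a keyspace has already been selected with USE, and the age column is a hypothetical non-key column added only for illustration.

```cql
-- Assumption: a keyspace is already selected (USE <your_keyspace>;).
-- The "age" column is hypothetical and only there to have some non-key data.
CREATE TABLE IF NOT EXISTS cats (
    color text,
    name  text,
    age   int,
    PRIMARY KEY (color, name)  -- color = partition key, name = clustering column
);

INSERT INTO cats (color, name, age) VALUES ('blue', 'henry', 3);

-- Query by color only: reads a single partition; rows come back ordered by name.
SELECT * FROM cats WHERE color = 'blue';

-- Query by color and name: identifies exactly one row.
SELECT * FROM cats WHERE color = 'blue' AND name = 'henry';

-- Query by name only is rejected unless ALLOW FILTERING is added,
-- because the partition key (color) is not restricted:
-- SELECT * FROM cats WHERE name = 'henry';
```

One thing to keep in mind with this layout is that all cats of the same color live in one partition, so if there are only a few distinct colors and very many cats, individual partitions could grow large.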