I am fairly new to Cassandra data modelling. I am trying to understand whether I can have a very high number of unique values in a clustering key. For example, we have four columns: storeid, shipping_status, orderid and guestname. We have approximately 3000 stores, 4 status types and a large number of order IDs each day. We need to query on storeid, status and sometimes orderid, so I am planning to use storeid and status as the partition key and orderid as the clustering key. My question is: can I keep a column of such fine granularity in the clustering key? orderid will have a huge number of unique values each day. Also, will there be any problem if I add guestname to the clustering key too? Thanks for your suggestions.
Granularity level in clustering key (high unique values)
Asked by john cena
Using `storeid` and `shipping_status` as parts of the partition key and then using `orderid` as a clustering key makes the situation very similar to time series data.

Cassandra is well suited to storing data with that model (aka "wide rows" in pre-CQL terms), and the hard limit is 2×10^9 (2 billion) cells per partition, that is, rows multiplied by the columns stored in each row.
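To get a feel for how far a given order volume is from that limit, here is a rough back-of-the-envelope sketch. The order volume used in the example is a made-up figure, not something taken from the question, and the function names are only for illustration. In practice much smaller partitions are usually recommended, so treat the hard limit as an upper bound rather than a target.

```python
# Hard upper bound on cells in a single Cassandra partition (about 2 billion).
CELL_LIMIT = 2_000_000_000


def cells_per_partition(orders_per_day: int, days: int, cells_per_row: int = 1) -> int:
    """Estimate the cells accumulated in one (storeid, status) partition.

    orders_per_day is a hypothetical volume for a single store and status;
    cells_per_row is the number of non-key columns stored per order.
    """
    return orders_per_day * days * cells_per_row


def fits_in_partition(orders_per_day: int, days: int, cells_per_row: int = 1) -> bool:
    """True if the estimated cell count stays below the hard limit."""
    return cells_per_partition(orders_per_day, days, cells_per_row) < CELL_LIMIT


if __name__ == "__main__":
    # Assumed example: 5000 orders per day for one store and one status,
    # one stored column (guestname) per order, kept for one year.
    print(cells_per_partition(5000, 365))
    print(fits_in_partition(5000, 365))
```

An open-ended partition (no time bucket) keeps growing for as long as the store exists, which is why the estimate needs a bounded number of days.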
So you should not go for "open-ended" partitions, but use chunking: you could have a partition key which is `storeid + status + year` if the volume of orders per year is much less than 2×10^9, or `storeid + status + year + month` if you're Amazon.

To answer your second question: no, there is no problem with having tables where all the columns are part of the primary key.
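The chunking idea could be sketched roughly as follows. The table name `orders_by_store_status`, the bucket column `order_year`, the column types and the helper `partition_key` are all assumptions made for illustration; only `storeid`, `shipping_status`, `orderid` and `guestname` come from the question. The CQL is held in plain strings so the snippet runs without a Cassandra driver.

```python
from datetime import date

# Hypothetical schema: the partition key is (storeid, shipping_status, order_year)
# and orderid is the clustering key. Column types are assumed.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS orders_by_store_status (
    storeid int,
    shipping_status text,
    order_year int,
    orderid bigint,
    guestname text,
    PRIMARY KEY ((storeid, shipping_status, order_year), orderid)
)
"""

# Lookup of a single order inside one partition (all key columns restricted).
SELECT_ORDER = """
SELECT guestname FROM orders_by_store_status
WHERE storeid = ? AND shipping_status = ? AND order_year = ? AND orderid = ?
"""


def partition_key(storeid: int, shipping_status: str, order_date: date,
                  by_month: bool = False) -> tuple:
    """Build the bucketed partition key values for an order.

    With by_month=True the month is appended as a finer bucket; the table
    would then need an extra month column in its partition key.
    """
    key = (storeid, shipping_status, order_date.year)
    if by_month:
        key += (order_date.month,)
    return key


if __name__ == "__main__":
    print(partition_key(42, "SHIPPED", date(2015, 6, 3)))
    print(partition_key(42, "SHIPPED", date(2015, 6, 3), by_month=True))
```

The trade-off of bucketing is that a query has to supply the bucket value as well, so looking up an order by `orderid` alone requires knowing (or trying) the year it belongs to.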