I'm using Cassandra with the DataStax driver. I need to do a batch read from Cassandra of roughly 2000 rows. My use case: I get a list of ids in my request, and those ids are my partition keys in Cassandra. Is it a good idea to spawn 2000 threads and fetch the data from Cassandra in parallel (each read would be efficient, since it goes to just one node)? Or is there a way to group the ids that live on the same node so that I can optimize the reads (in that case I would need to spawn far fewer threads and put less overhead on Cassandra)? Please let me know how I can achieve a batch read efficiently, other than by spawning multiple threads. Thanks! PS: The data coming back from Cassandra is not large enough to cause an OOM.
Locating cassandra partition node
639 Views Asked by User5817351 At
Yes, it is possible. You can get the token ranges for the Cassandra cluster, check which range the token of each id falls into, and then group the ids by node.
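To make the grouping idea concrete, here is a minimal sketch using only the Java standard library. It is not the driver's API: the class `TokenRingGrouping`, the node names, and the `toyToken` hash are all hypothetical stand-ins (a real cluster would typically use Murmur3 tokens and the ranges reported by the driver's metadata). It only illustrates the lookup: each range `(start, end]` belongs to the node that owns the `end` token, and tokens past the last range wrap around to the first node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TokenRingGrouping {

    // Hypothetical ring: maps the END token of each range to the node that owns it.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public TokenRingGrouping(Map<Long, String> endTokenToNode) {
        ring.putAll(endTokenToNode);
    }

    // Stand-in hash for illustration only. This is NOT the Murmur3 partitioner.
    public static long toyToken(String id) {
        return id.hashCode();
    }

    // Finds the owner of a token: the first range end >= token, wrapping around the ring.
    public String ownerOf(long token) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Groups ids by the node that owns each id's token.
    public Map<String, List<String>> groupByNode(List<String> ids) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String id : ids) {
            groups.computeIfAbsent(ownerOf(toyToken(id)), n -> new ArrayList<>()).add(id);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<Long, String> ranges = new TreeMap<>();
        ranges.put(-1_000_000_000L, "node-a"); // hypothetical node names and tokens
        ranges.put(0L, "node-b");
        ranges.put(1_000_000_000L, "node-c");

        TokenRingGrouping grouping = new TokenRingGrouping(ranges);
        System.out.println(grouping.groupByNode(List.of("id-1", "id-2", "id-3", "id-4")));
    }
}
```

With replication, each token has several replicas, so a real implementation would probably group by replica set or pick one replica per id rather than a single owner.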
In addition:
There is no need to spawn many threads. The DataStax driver provides an asynchronous API; we use it in our project to run many queries in parallel, and it works well enough, though performance is not excellent.
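As a rough illustration of the asynchronous approach, the sketch below issues all reads from a single calling thread while capping how many are in flight at once. It is written against a hypothetical `asyncQuery` function that returns a `CompletableFuture`, standing in for the driver's asynchronous execute call; the class name `BoundedAsyncReads` and the `maxInFlight` parameter are assumptions made for this example, not part of the driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

public class BoundedAsyncReads {

    // Issues one async query per key, allowing at most maxInFlight unfinished queries.
    public static <K, V> List<V> readAll(List<K> keys,
                                         Function<K, CompletableFuture<V>> asyncQuery,
                                         int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<V>> futures = new ArrayList<>(keys.size());

        for (K key : keys) {
            permits.acquireUninterruptibly(); // wait here if too many queries are in flight
            CompletableFuture<V> future;
            try {
                future = asyncQuery.apply(key);
            } catch (RuntimeException e) {
                permits.release();
                throw e;
            }
            future.whenComplete((value, error) -> permits.release());
            futures.add(future);
        }

        // Collect results in key order; join() rethrows a failed query as CompletionException.
        List<V> results = new ArrayList<>(keys.size());
        for (CompletableFuture<V> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a per-id query against the cluster.
        Function<String, CompletableFuture<String>> fakeQuery =
                id -> CompletableFuture.supplyAsync(() -> "row-for-" + id);

        List<String> rows = readAll(List.of("id-1", "id-2", "id-3"), fakeQuery, 2);
        System.out.println(rows);
    }
}
```

The cap is there so that 2000 requests do not all hit the connection pool at once; a suitable value depends on the cluster and driver configuration.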
Needing thousands of requests to read data indicates an unsuitable data model. Design the data model around your queries so that the number of requests is minimized; that is how you get good performance.
Update:
I suppose you can use the method Metadata.newToken to calculate the token on the driver side, or get the replicas directly with Metadata.getReplicas for a given partition key. Before that, though, you need to serialize the partition key according to its type and the protocol version.
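As a small sketch of what that serialization amounts to for the simplest cases, the helper below produces the bytes for a single-column partition key of type text (UTF-8 bytes) or bigint (8 bytes, big-endian). The class `PartitionKeyBytes` is hypothetical; in practice the driver's own codecs would normally do this, and composite partition keys use a different layout that is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PartitionKeyBytes {

    // text / varchar key: the raw UTF-8 bytes of the string.
    public static ByteBuffer ofText(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    // bigint key: 8 bytes, big-endian (ByteBuffer's default byte order).
    public static ByteBuffer ofBigint(long value) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        buffer.putLong(value);
        buffer.flip(); // make the written bytes readable from position 0
        return buffer;
    }

    public static void main(String[] args) {
        System.out.println(ofText("user-42").remaining() + " bytes for the text key");
        System.out.println(ofBigint(42L).remaining() + " bytes for the bigint key");
    }
}
```

The resulting buffer is the kind of value you would pass as the partition key when asking the driver's metadata for replicas.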