I have a table in Cassandra with 80 million+ records (possibly more). I updated the schema to add a new column to the table, and now I need to populate that column for the existing rows. I wrote a migration script using cassandra-driver and tried batching and token-based range queries, but the data set is so large that the script runs for more than 3 hours without finishing the update (the process gets terminated after 2-3 hours).
What is the best way to handle this type of migration? Is there any other way to achieve this?
Update a column in a table having huge data (80mn+ rows) in cassandra
210 Views Asked by programoholic At
Usually for such things it's easier to use Spark (although I'm not sure how it works with Amazon Keyspaces). It's quite hard to do a range scan correctly: you need to handle edge cases, etc. (I have an example for the Java driver that uses the same algorithm as the Spark Cassandra Connector and DSBulk.)
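To give a feel for what a range scan involves, here is a minimal Python sketch that splits the full token ring into contiguous sub-ranges and builds one CQL query per sub-range. It assumes the table uses Murmur3Partitioner (tokens from -2^63 to 2^63 - 1); the function names and the keyspace, table, and column names are hypothetical. It is a simplified even split, not the algorithm used by the Spark Cassandra Connector or DSBulk, which also align ranges with the replicas that own them.

```python
# Token bounds for Murmur3Partitioner (assumption: the cluster uses it).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_splits):
    """Return num_splits contiguous, inclusive (start, end) token ranges
    that together cover the whole ring with no gaps or overlaps."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 possible tokens
    ranges = []
    start = MIN_TOKEN
    for i in range(num_splits):
        # Integer arithmetic keeps the boundaries exact; the last range
        # always ends at MAX_TOKEN.
        end = MIN_TOKEN + (total * (i + 1)) // num_splits - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_query(keyspace, table, partition_key, columns, start, end):
    """Build a SELECT restricted to one inclusive token range."""
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM {keyspace}.{table} "
        f"WHERE token({partition_key}) >= {start} AND token({partition_key}) <= {end}"
    )


if __name__ == "__main__":
    # Hypothetical keyspace/table/column names, for illustration only.
    for start, end in split_token_ring(4):
        print(range_query("my_ks", "my_table", "id", ["id", "name"], start, end))
```

Each sub-range can then be processed and checkpointed independently, so a terminated run can resume from the last completed range instead of starting over.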
You can use Python with Spark and the Spark Cassandra Connector to update your data; the complexity of the update will depend on your algorithm.
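A rough PySpark sketch of that approach might look like the following. It assumes the Spark Cassandra Connector is on the classpath and that `spark.cassandra.connection.host` is configured; the helper names (`cassandra_options`, `migrate`) and the idea of expressing the new value as a Spark SQL expression are assumptions for illustration, and I have not checked how this behaves against Amazon Keyspaces.

```python
def cassandra_options(keyspace, table):
    """Options the connector's DataFrame source uses to locate a table."""
    return {"keyspace": keyspace, "table": table}


def migrate(spark, keyspace, table, key_columns, new_column, value_expr):
    """Fill new_column for rows where it is still null.

    key_columns must list every primary key column (partition and
    clustering) so that each written row upserts the matching existing row.
    value_expr is a Spark SQL expression over the table's columns
    (hypothetical; replace it with your own logic).
    """
    # Imported lazily so this module loads even without pyspark installed.
    from pyspark.sql import functions as F

    opts = cassandra_options(keyspace, table)
    df = (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**opts)
        .load()
    )
    updated = df.filter(F.col(new_column).isNull()).select(
        *key_columns, F.expr(value_expr).alias(new_column)
    )
    # Writing only the key columns plus the new column leaves the other
    # columns of each row untouched.
    (
        updated.write.format("org.apache.spark.sql.cassandra")
        .options(**opts)
        .mode("append")
        .save()
    )
```

A call could look like `migrate(spark, "my_ks", "my_table", ["id"], "full_name", "concat(first_name, ' ', last_name)")`, where all of those names are placeholders.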
Another approach is to put the logic into your app: if it receives `null` from Cassandra for the given column, it can return the calculated value.
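As a small illustration of this read-time fallback, assuming rows are available as dict-like objects and that the new column can be derived from the other columns (the `full_name` rule and all names here are hypothetical):

```python
from typing import Any, Callable, Mapping


def with_fallback(
    row: Mapping[str, Any],
    column: str,
    compute: Callable[[Mapping[str, Any]], Any],
) -> Any:
    """Return the stored value, or compute it when the column is null."""
    value = row.get(column)
    return compute(row) if value is None else value


def full_name_from_parts(row: Mapping[str, Any]) -> str:
    # Hypothetical derivation rule for the new column.
    return f"{row['first']} {row['last']}"


if __name__ == "__main__":
    old_row = {"first": "Ada", "last": "Lovelace", "full_name": None}
    print(with_fallback(old_row, "full_name", full_name_from_parts))
```

With this in place no bulk migration is needed up front; the app could also write the computed value back on read so the column fills in gradually.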