Cassandra-driver Client.batch() gives RangeError


This code

const cassandra = require('cassandra-driver');
const Long = require('cassandra-driver').types.Long;
const client = new cassandra.Client({
  contactPoints: ['localhost:9042'],
  localDataCenter: 'datacenter1',
  keyspace: 'ks'
});
let q = [];
const ins_q = 'INSERT INTO ks.table1 (id, num1, num2, txt, date) VALUES (?, 33, 44, \'tes2\', toTimeStamp(now()));';
// Builds 100,000 batched statements for ids 50000000003 through 50000100002.
for (let i = 50000000003n; i < 50000100003n; i++) {
  q.push({ query: ins_q, params: [Long.fromString(i.toString(), true)] });
}

client.batch(q, { prepare: true }).catch(err => {
  console.log('Failed %s', err);
});

is causing this error

Failed RangeError [ERR_OUT_OF_RANGE]: The value of "value" is out of range. It must be >= 0 and <= 65535. Received 100000
    at new NodeError (f:\node\lib\internal\errors.js:371:5)
    at checkInt (f:\node\lib\internal\buffer.js:72:11)
    at writeU_Int16BE (f:\node\lib\internal\buffer.js:832:3)
    at Buffer.writeUInt16BE (f:\node\lib\internal\buffer.js:840:10)
    at FrameWriter.writeShort (f:\node\test\node_modules\cassandra-driver\lib\writers.js:47:9)
    at BatchRequest.write (f:\node\test\node_modules\cassandra-driver\lib\requests.js:438:17)

Is this a bug? I tried execute() with one bigint the same way and there was no problem.

"cassandra-driver": "^4.6.3"

1 Answer

Failed RangeError [ERR_OUT_OF_RANGE]: The value of "value" is out of range. It must be >= 0 and <= 65535. Received 100000

Is this a bug?

No, this is Cassandra protecting the cluster from running a large batch and crashing one or more nodes.

While you do appear to be running this on your own machine, Cassandra is first and foremost a distributed system, so it has certain guardrails built in to prevent non-distributed usage patterns from causing problems. This is one of them.

What happens with a batch is that the driver knows no single node is responsible for all of the different values of id. So it sends the batch of 100k statements to one node picked as the "coordinator." That coordinator then "coordinates" the writes, forwarding each statement to the replicas that own its partition and waiting for their acknowledgements.

Or rather, it will try to, but it will probably time out before getting through even a fifth of a batch this size. Remember, BATCH in Cassandra was built to run five or six write operations that keep five or six tables in sync, not 100k write operations to the same table.

The way to approach this scenario is to execute each write operation individually. If you want to optimize the process, run the writes asynchronously: keep only a certain number of requests in flight at a time, block on their completion, and then start the next set. Repeat until complete, as in the sketch below.
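
Here is a minimal sketch of that pattern in Node.js, reusing the client, table, and id range from the question (the insertRange helper and the concurrency value of 128 are illustrative choices, not driver APIs):

const cassandra = require('cassandra-driver');
const Long = cassandra.types.Long;
const client = new cassandra.Client({
  contactPoints: ['localhost:9042'],
  localDataCenter: 'datacenter1',
  keyspace: 'ks'
});
const ins_q = 'INSERT INTO ks.table1 (id, num1, num2, txt, date) VALUES (?, 33, 44, \'tes2\', toTimeStamp(now()));';

// Execute each write individually, keeping at most `concurrency` in flight.
async function insertRange(start, end, concurrency = 128) {
  let inFlight = [];
  for (let i = start; i < end; i++) {
    inFlight.push(client.execute(ins_q, [Long.fromString(i.toString(), true)], { prepare: true }));
    if (inFlight.length >= concurrency) {
      await Promise.all(inFlight); // block on this window before starting the next
      inFlight = [];
    }
  }
  await Promise.all(inFlight);     // drain the final partial window
}

insertRange(50000000003n, 50000100003n)
  .then(() => console.log('done'))
  .catch(err => console.log('Failed %s', err));

Recent 4.x versions of the driver also ship a helper for this pattern, executeConcurrent in its concurrent module, which is worth a look if you'd rather not manage the windows yourself.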

In short, there are many nuances of Cassandra that differ from a relational database, and the use and implementation of BATCH writes is one of them.

Why does it cause a range error?

Because of this part in the error message:

It must be >= 0 and <= 65535

The native protocol encodes the number of statements in a batch as an unsigned 16-bit integer (that is the writeUInt16BE call in the stack trace), so the Node.js driver cannot serialize a batch of more than 65535 statements. By the looks of it, it is being sent 100000.
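
You can reproduce the failure with plain Node, since the limit comes from writing the statement count into a two-byte field of the protocol frame (a small illustration of the mechanism, not actual driver code):

// The batch frame stores its statement count as an unsigned 16-bit integer.
const buf = Buffer.alloc(2);
buf.writeUInt16BE(65535, 0);  // fine: the largest count that fits in two bytes
buf.writeUInt16BE(100000, 0); // throws the same RangeError [ERR_OUT_OF_RANGE]

So even before the cluster could reject the batch, the driver simply cannot serialize more than 65535 statements into a single BATCH frame.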