I run it with these settings:
dsbulk unload -k keyspace -t table
--connector.csv.delimiter "^"
--engine.maxConcurrentQueries=4
--connector.csv.url
...
While the unload runs, the application complains about connection pool exhaustion and gets timeouts on its connections to Cassandra.
- cassandra version 2.13
- cassandra cluster: 3 nodes, each with 64 CPUs and 124 GB RAM.
Can you explain which DSBulk settings I should use?
It sounds like your cluster is getting overloaded and cannot handle the unload operation. You will need to throttle DSBulk to lower the number of requests it sends. Here are some options you can use as starting points to limit the load on your cluster:
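As a minimal sketch of this kind of throttling, the command from the question could be extended with a lower `--engine.maxConcurrentQueries` plus the DSBulk executor options `--executor.maxPerSecond` and `--executor.maxInFlight`, assuming your DSBulk version supports them. The numeric values, the helper name `throttled_unload_cmd`, and the output path are hypothetical placeholders to tune, not tested recommendations.

```shell
#!/bin/sh
# Hypothetical starting values: lower them further if the application still times out.
MAX_CONCURRENT_QUERIES=2   # parallel read queries (--engine.maxConcurrentQueries)
MAX_PER_SECOND=1000        # cap on requests per second (--executor.maxPerSecond)
MAX_IN_FLIGHT=128          # cap on in-flight requests (--executor.maxInFlight)

# Print the throttled command so it can be reviewed before it is run.
# $1 = keyspace, $2 = table, $3 = CSV output location (placeholder; not given in the question)
throttled_unload_cmd() {
  printf 'dsbulk unload -k %s -t %s --connector.csv.delimiter "^" --connector.csv.url %s --engine.maxConcurrentQueries %s --executor.maxPerSecond %s --executor.maxInFlight %s\n' \
    "$1" "$2" "$3" "$MAX_CONCURRENT_QUERIES" "$MAX_PER_SECOND" "$MAX_IN_FLIGHT"
}

throttled_unload_cmd keyspace table /path/to/output
```

The script only prints the command. Once the values look reasonable for your cluster, copy the printed line into a terminal to run it, then raise or lower the limits based on how the application behaves during the unload.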
For details on these options, see:
With these settings, it will take a little longer for the unload operation to complete, but it will at least minimise the risk of taking down your cluster. Cheers!