I'm using dsbulk to load data into a Cassandra cluster. The configuration currently includes -maxErrors 0 so the load fails fast on any issue.
It's not clear to me how the retry strategy defined by
advanced.retry-policy.class =
"com.datastax.oss.dsbulk.workflow.commons.policies.retry.MultipleRetryPolicy"
advanced.retry-policy.max-retries = 10
works with 0 allowed errors.
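For context, this is roughly how those two settings sit together in a DSBulk configuration file. This is a sketch, not my exact config: the file layout assumes DSBulk's standard application.conf conventions, where -maxErrors is the shortcut for dsbulk.log.maxErrors and the retry policy lives under the driver's settings.

```
# Sketch of an application.conf combining fail-fast with the retry policy.
# Assumes -maxErrors maps to dsbulk.log.maxErrors (DSBulk shortcut).
dsbulk {
  log.maxErrors = 0
}
datastax-java-driver {
  advanced.retry-policy {
    class = "com.datastax.oss.dsbulk.workflow.commons.policies.retry.MultipleRetryPolicy"
    max-retries = 10
  }
}
```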
Will a failed query be retried 10 times before the entire operation is aborted, or will no retries be performed at all? The whole load aborts on the first issue, but the logs don't make it clear whether the failed query was retried.
We need additional details to help triage this.

What is the output of ./dsbulk --version? If it is anything lower than 1.10.0, I'd strongly recommend that you download the latest version for your OS from here. Then re-run the failed operation (./dsbulk load) with --dsbulk.log.stmt.level EXTENDED so the logs include full statement details.
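A re-run with extended statement logging might look like the following sketch; the keyspace, table, and file path are placeholders for your own values, not part of the original report.

```shell
# Re-run the same load with extended statement logging so that failing
# statements appear in full in the logs. Keyspace, table, and file path
# below are placeholders.
./dsbulk load \
  -k my_keyspace -t my_table \
  -url /path/to/data.csv \
  -maxErrors 0 \
  --dsbulk.log.stmt.level EXTENDED
```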
Having said that, DataStax Bulk Loader (DSBulk for short) offers a replay strategy that lets you resume a failed operation and process the new and failed records once you've fixed the underlying problem (e.g. a data issue or cluster instability). It does this via the checkpoint file (either the default one or the one you specified with --dsbulk.log.checkpoint.file <string>). The checkpoint information is printed to both the console and the log file for convenience.
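As a sketch, resuming from a checkpoint might look like this. The --dsbulk.log.checkpoint.replayStrategy option and its retry value are my assumption based on DSBulk 1.10's checkpointing feature, and the paths are placeholders, so please verify both against the docs for your version.

```shell
# Hypothetical resume of a failed load using the checkpoint file written
# by the previous run. The replayStrategy flag is assumed from DSBulk 1.10's
# checkpoint feature; file paths are placeholders.
./dsbulk load \
  -k my_keyspace -t my_table \
  -url /path/to/data.csv \
  --dsbulk.log.checkpoint.file /path/to/previous/checkpoint.csv \
  --dsbulk.log.checkpoint.replayStrategy retry
```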