I use spring batch to load data from a file to a database.The job contains only one step.I use a ThreadPoolTaskExecutor to execute step concurrently.The step is similar to this one.
public Step MyStep(){
return StepBuilderFactory.get("MyStep")
.chunk(10000)
.reader(flatFileItemWriter)
.writer(jdbcBatchItemWriter)
.faultTolerant()
.skip(NumberFormatException.class)
.skip(FlatFileParseException.class)
.skipLimit(3)
.throttleLImit(10)
.taskExecutor(taskExecutor)
.build();
}
There are 3 "numberformat" errors in my file,so I set skipLimit 3,but I find that when I execute the job,it will start 10 threads and each thread has 3 skips,so I have 3 * 10 = 30 skips in total,while I only need 3.
So the question is will this cause any problems?And is there any other way to skip exactly 3 times while executing a step concurrently?
github issue
Robert Kasanicky opened BATCH-926 and commented
When chunks execute concurrently each chunk works with its own local copy of skipCount (from StepContribution). Given skipLimit=1 and 10 chunks execute concurrenlty we can end up with successful job execution and 10 skips. As the number of concurrent chunks increases the skipLimit becomes more a limit per chunk rather than job.
However, this seems to be a correct thing for a very old version of spring batch. I have a different problem in my code, but the skipLimit seems to be aggregated when I use multiple threads. Albeit the job sometimes hangs without properly failing when SkipLimitException is thrown.