I'm writing a Spring Batch job, and in one of my steps I have the following code for the processor:
@Component
public class SubscriberProcessor implements ItemProcessor<NewsletterSubscriber, Account>, InitializingBean {

    @Autowired
    private AccountService service;

    @Override
    public Account process(NewsletterSubscriber item) throws Exception {
        if (!Strings.isNullOrEmpty(item.getId())) {
            return service.getAccount(item.getId());
        }
        // search with email address
        List<Account> accounts = service.findByEmail(item.getEmail());
        checkState(accounts.size() <= 1, "Found more than one account with email %s", item.getEmail());
        return accounts.isEmpty() ? null : accounts.get(0);
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(service, "account service must be set");
    }
}
The above code works but I've found out that there are some edge cases where having more than one Account per NewsletterSubscriber is allowed. So I need to remove the state check and to pass more than one Account to the item writer.
One solution I found is to change both ItemProcessor and ItemWriter to deal with List<Account> instead of Account, but this has two drawbacks:
- Code and tests are uglier and harder to write and maintain because of nested lists in the writer
- Most importantly, more than one Account object may be written in the same transaction, because a list given to the writer may contain multiple accounts, and I'd like to avoid this.
Is there any way, maybe using a listener, or replacing some internal component used by spring batch to avoid lists in processor?
Update
I've opened an issue on the Spring Jira for this problem.
I'm looking into the isComplete and getAdjustedOutputs methods in FaultTolerantChunkProcessor, which are marked as extension points in SimpleChunkProcessor, to see if I can use them in some way to achieve my goal.
Any hint is welcome.
There isn't a way to return more than one item per call to an ItemProcessor in Spring Batch without getting pretty far into the weeds. If you really want to know where the relationship between an ItemProcessor and ItemWriter exists (not recommended), take a look at the implementations of the ChunkProcessor interface. While the simple case (SimpleChunkProcessor) isn't that bad, if you use any of the fault-tolerant logic (skip/retry via FaultTolerantChunkProcessor), it gets very unwieldy very quickly.
A much simpler option would be to move this logic to an ItemReader that does this enrichment before returning the item. Wrap whatever ItemReader you're using in a custom ItemReader implementation that does the service lookup before returning the item. In this case, instead of returning a NewsletterSubscriber from the reader, you'd be returning an Account based on the previous information.
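As a rough illustration of the wrapping-reader idea, here is a minimal sketch. It is not taken from Spring Batch itself: the ItemReader interface below is a local stand-in for org.springframework.batch.item.ItemReader so the example compiles on its own, and NewsletterSubscriber, Account, and AccountService are simplified, hypothetical versions of the types in the question. The class name AccountLookupReader is also made up. The reader buffers the looked-up accounts and hands them out one per read() call, so the writer never sees a nested list.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Stand-in for org.springframework.batch.item.ItemReader so the sketch
// compiles without Spring Batch on the classpath; null means "no more input".
interface ItemReader<T> {
    T read() throws Exception;
}

// Hypothetical, minimal versions of the types from the question.
class NewsletterSubscriber {
    private final String id;
    private final String email;
    NewsletterSubscriber(String id, String email) { this.id = id; this.email = email; }
    String getId() { return id; }
    String getEmail() { return email; }
}

class Account {
    private final String id;
    Account(String id) { this.id = id; }
    String getId() { return id; }
}

interface AccountService {
    Account getAccount(String id);
    List<Account> findByEmail(String email);
}

// Wraps the original reader and returns one Account per read() call,
// buffering the extra matches when a subscriber maps to several accounts.
class AccountLookupReader implements ItemReader<Account> {
    private final ItemReader<NewsletterSubscriber> delegate;
    private final AccountService service;
    private final Deque<Account> pending = new ArrayDeque<>();

    AccountLookupReader(ItemReader<NewsletterSubscriber> delegate, AccountService service) {
        this.delegate = delegate;
        this.service = service;
    }

    @Override
    public Account read() throws Exception {
        // Keep pulling subscribers until at least one account is buffered.
        while (pending.isEmpty()) {
            NewsletterSubscriber subscriber = delegate.read();
            if (subscriber == null) {
                return null; // delegate exhausted: end of input
            }
            String id = subscriber.getId();
            if (id != null && !id.isEmpty()) {
                Account account = service.getAccount(id);
                if (account != null) {
                    pending.add(account);
                }
            } else {
                // Search by email address; zero, one, or many accounts may match.
                pending.addAll(service.findByEmail(subscriber.getEmail()));
            }
        }
        return pending.poll();
    }
}
```

Because each account is now a separate item, the chunk's commit interval decides how many accounts share a transaction. Note that this sketch keeps its buffer only in memory; a real implementation would probably also need to forward the ItemStream callbacks to the delegate and think about what happens to buffered accounts on restart.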