Process different files by name with a partitioner - Spring Batch

I am working on a Spring Batch job that reads all CSV files from a folder and needs to process each one differently based on its file name. The code below works fine for a single file type, but I need to decide which step to execute based on the file name. Can anyone help with this logic?

My File Name format:

EMP.CRE.3434234.3.6.csv
EMP.UPT.3434234.3.7.csv
STD.CRE.3434234.3.8.csv
STD.UPT.3434234.3.9.csv

Based on EMP or STD, I need to execute different processing logic. CRE means the create logic (step) should run, and UPT means the data should be updated.

Below is my current code structure:

@Bean
public Job employeeJob() throws Exception {
  return jobs.get("employeeJob")
             .start(masterStep())
             .build();
}

@Bean
public Step masterStep() throws Exception {
  // The master step fans out one slaveStep execution per partition.
  return steps.get("masterStep")
              .partitioner(slaveStep())
              .partitioner("partition", partitioner())
              .taskExecutor(taskExecutor())
              .build();
}

@Bean
public Step slaveStep() throws Exception {
  // The null arguments are placeholders; step-scoped beans get their values at runtime.
  return steps.get("slaveStep").<EmployeeData, EmployeeData>chunk(10)
              .reader(reader(null))
              .processor(processor(null, null))
              .writer(writer())
              .build();
}

@Bean
@JobScope
public Partitioner partitioner() throws Exception {
  MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
  PathMatchingResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
  // Each matching file becomes one partition; its URL is stored under the key "fileName".
  partitioner.setResources(resolver.getResources("file:" + Resourcepath + filetype));
  // No explicit partition(20) call here: the partition step invokes partition() itself.
  return partitioner;
}

@Bean
@StepScope
public FlatFileItemReader<EmployeeData> reader(@Value("#{stepExecutionContext['fileName']}") String file) {
  return new FlatFileItemReaderBuilder<EmployeeData>()
  ...
  ...
  .build();
}

There is 1 answer below

I would not bother with scaling the job before making it work. You have two conditional flows here: 1) EMP or STD and 2) CRE or UPT.

  • For 1), I would create two steps (one for each type, following the Unix philosophy of making each thing do one thing and do it well) and use a decider that picks which step to run based on the file name.

  • For 2), I would keep it simple and use a conditional statement in the writer to create or update items accordingly. The other option is a ClassifierCompositeItemWriter, which is more elegant but takes more work to implement.
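
The two decisions above could be sketched in plain Java roughly as follows. This is an illustration only, not Spring Batch configuration: the names `FileNameRouting`, `parse`, `stepFor` and `writeAction` are made up for this sketch, and it assumes every file name starts with `<EMP|STD>.<CRE|UPT>.` as in the question. In a real job, the string returned by `stepFor` would typically be wrapped in the `FlowExecutionStatus` that a `JobExecutionDecider` returns, and `writeAction` stands in for the conditional inside the writer.

```java
import java.util.Locale;

final class FileNameRouting {

    /** First dot-separated token of the file name. */
    enum Entity { EMP, STD }

    /** Second dot-separated token of the file name. */
    enum Operation { CRE, UPT }

    /** Parsed view of a name such as EMP.CRE.3434234.3.6.csv. */
    static final class FileKind {
        final Entity entity;
        final Operation operation;

        FileKind(Entity entity, Operation operation) {
            this.entity = entity;
            this.operation = operation;
        }
    }

    /** Accepts a bare file name, a path, or a "file:" URL. */
    static FileKind parse(String pathOrUrl) {
        // Drop any directory part so only the file name is inspected.
        int slash = Math.max(pathOrUrl.lastIndexOf('/'), pathOrUrl.lastIndexOf('\\'));
        String name = pathOrUrl.substring(slash + 1);

        String[] tokens = name.split("\\.");
        if (tokens.length < 3) {
            throw new IllegalArgumentException("Unexpected file name: " + name);
        }
        // valueOf throws IllegalArgumentException for unknown prefixes.
        Entity entity = Entity.valueOf(tokens[0].toUpperCase(Locale.ROOT));
        Operation operation = Operation.valueOf(tokens[1].toUpperCase(Locale.ROOT));
        return new FileKind(entity, operation);
    }

    /** Decision 1: the status string a decider could return to pick the step. */
    static String stepFor(FileKind kind) {
        return kind.entity.name();
    }

    /** Decision 2: what the writer should do with each item. */
    static String writeAction(FileKind kind) {
        return kind.operation == Operation.CRE ? "INSERT" : "UPDATE";
    }
}
```

For example, `parse("file:/data/in/EMP.CRE.3434234.3.6.csv")` yields entity `EMP` and operation `CRE`, so the EMP step would run and the writer would insert rather than update.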

With regard to scaling, I would create a job instance per file (again, following the Unix philosophy) and pass the file name as a job parameter. I'm not sure partitioning is a good option for your use case: even though it's possible to create a partition per file, the final solution would be convoluted compared to the job-instance-per-file approach.
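
A rough sketch of the job-instance-per-file idea, again in plain Java so it runs without Spring on the classpath. `PerFileLauncher`, `launchAll` and the `fileName` key are hypothetical names chosen for this example; the `Consumer` callback stands in for whatever actually launches the job.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Consumer;

final class PerFileLauncher {

    /** Hypothetical parameter key; the reader must look up the same key. */
    static final String FILE_NAME_KEY = "fileName";

    /**
     * Invokes launchOneJob once per CSV file, passing a parameter map that
     * identifies the file. Returns the parameter maps in launch order.
     */
    static List<Map<String, String>> launchAll(List<String> files,
                                               Consumer<Map<String, String>> launchOneJob) {
        List<Map<String, String>> launched = new ArrayList<>();
        for (String file : files) {
            if (!file.toLowerCase(Locale.ROOT).endsWith(".csv")) {
                continue; // skip anything that is not a CSV input
            }
            Map<String, String> params = new LinkedHashMap<>();
            params.put(FILE_NAME_KEY, file);
            launchOneJob.accept(params); // one job instance per file
            launched.add(params);
        }
        return launched;
    }
}
```

In a Spring Batch application the callback would typically build the parameters with `new JobParametersBuilder().addString("fileName", file).toJobParameters()` and pass them to `JobLauncher.run(job, parameters)`, handling the checked exceptions that `run` declares. Because the file name is an identifying parameter, each file gets its own job instance and can be restarted independently.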