I am building a simple Spring Batch application that reads a CSV file and writes its contents into a database. I want to throw an exception when someone tries to trigger a job that is already running, so before starting the job I check for a running execution with this code:
    JobInstance lastJobInstance = jobExplorer.getLastJobInstance(job.getName());
    if (lastJobInstance != null) {
        List<JobExecution> runningExecutions = jobExplorer.getJobExecutions(lastJobInstance);
        for (JobExecution execution : runningExecutions) {
            if (execution.getStatus() == BatchStatus.STARTED || execution.getStatus() == BatchStatus.STARTING) {
                throw new IllegalStateException("Job is already running: " + job.getName());
            }
        }
    }
But in my DB (the batch_job_execution table) I see this entry:
| job_execution_id | version | job_instance_id | create_time | start_time | end_time | status | exit_code | exit_message | last_updated | job_configuration_location |
|---|---|---|---|---|---|---|---|---|---|---|
| 13 | 1 | 13 | 2023-08-16 13:36:59.410 | 2023-08-16 13:36:59.459 | | STARTED | UNKNOWN | | 2023-08-16 13:36:59.459 | |
This means that a job started but never completed, and the reason is not known. Its exit code shows UNKNOWN. This job was run weeks ago; it neither failed nor completed. Also, while a job has started and is still in progress, its exit code shows UNKNOWN as well.
I don't understand how to filter this stale job out while still refusing to launch a job that is genuinely running.
So now, whenever I try to run a new job, I get the "Job is already running" error because of this one entry. Is there any way to filter it out? It is a very old entry and it may never complete or fail.
Spring Batch does this automatically if you define your job instances correctly. In your example, if the input file is an identifying job parameter, launching the same job instance while it is still running leads to a JobExecutionAlreadyRunningException. So there is no need to write the code you shared and check for running executions manually.
Now, if the job execution was abruptly terminated, it is up to you to decide whether to restart it or abandon it. In this case, you need to manually change its status in the database; see Aborting a Job.
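
If it helps to decide which executions are candidates for that manual status change, here is a minimal sketch of one possible heuristic: treat an execution as stale when it still reads STARTED or STARTING but its last_updated value is older than some cutoff. This is plain Java and not Spring Batch API; `ExecutionRow`, `isStale`, `staleRows`, the 24-hour cutoff, and the "now" date are all hypothetical names and values chosen for illustration, and it assumes last_updated stops advancing once the process has died.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.List;
import java.util.stream.Collectors;

public class StaleExecutionCheck {

    // Hypothetical stand-in for one row of batch_job_execution (not a Spring Batch type).
    record ExecutionRow(long id, String status, LocalDateTime lastUpdated) {}

    // An execution is treated as stale if it still looks running
    // but has not been updated within maxIdle (an assumed threshold).
    static boolean isStale(ExecutionRow row, LocalDateTime now, Duration maxIdle) {
        boolean looksRunning = "STARTED".equals(row.status()) || "STARTING".equals(row.status());
        return looksRunning && row.lastUpdated().isBefore(now.minus(maxIdle));
    }

    // Returns only the rows that the heuristic flags as stale.
    static List<ExecutionRow> staleRows(List<ExecutionRow> rows, LocalDateTime now, Duration maxIdle) {
        return rows.stream()
                .filter(r -> isStale(r, now, maxIdle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed "current" time, a few weeks after the entry from the question.
        LocalDateTime now = LocalDateTime.of(2023, 9, 6, 12, 0);
        List<ExecutionRow> rows = List.of(
                // The entry from the question: STARTED, last updated weeks ago.
                new ExecutionRow(13, "STARTED", LocalDateTime.of(2023, 8, 16, 13, 36, 59, 459_000_000)),
                // A made-up execution that is genuinely running.
                new ExecutionRow(14, "STARTED", now.minusMinutes(5)),
                // A made-up finished execution; never flagged regardless of age.
                new ExecutionRow(15, "COMPLETED", now.minusDays(30)));

        for (ExecutionRow r : staleRows(rows, now, Duration.ofHours(24))) {
            System.out.println("Candidate to mark FAILED or ABANDONED: " + r.id());
        }
    }
}
```

With the sample data above, only execution 13 is flagged. The cutoff is a judgment call: it should be comfortably longer than the longest gap you expect between updates of a healthy run, otherwise a slow but live job could be flagged by mistake.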