Run one sequential task after big MPI job in SLURM


I have a slurm job which I launch using batch script, say:

#! /bin/bash -l

#SBATCH --job-name=job1
#SBATCH -o stdout.log
#SBATCH -e stderr.log
#SBATCH --ntasks=160

cd $WORK/job1

mpirun ./mympitask # 1.)

./collect_results  # 2.) long-running sequential task.

The first step (1.) runs in parallel using MPI. The second step (2.), however, needs only one task; the remaining tasks should be released so that they do not sit idle and burn useless CPU-time.

Is it possible, for example, to:

a) release all tasks except one, and run the final step on a single CPU?

b) specify a command that should run after the sbatch job is done?

I was thinking about using an salloc call for the last step.

BEST ANSWER

Both options are possible with SLURM:

1) Before running the sequential post-processing task, you can shrink the job to one node:

scontrol update job=$SLURM_JOBID NodeList=`hostname`

I do not know if and how to shrink the job to one core.
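Put into the batch script from the question, option 1 would look roughly like this. This is only a sketch: the scontrol line is the one given above, and the exact shrinking behavior can vary between SLURM versions.

```shell
#! /bin/bash -l
#SBATCH --job-name=job1
#SBATCH -o stdout.log
#SBATCH -e stderr.log
#SBATCH --ntasks=160

cd $WORK/job1

mpirun ./mympitask                              # 1.) parallel MPI step

# shrink the allocation to the node this script is running on,
# releasing the other nodes before the serial step
scontrol update job=$SLURM_JOBID NodeList=$(hostname)

./collect_results                               # 2.) sequential step
```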

2) Another option is to submit two jobs, with the post-processing job made dependent on the MPI job:

sbatch mpijob.slurm
sbatch -d afterok:<mpijob SLURM jobid> postprocessing.slurm

The non-trivial part (though this is not rocket science) is to automatically retrieve the job id of the first job.
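Two common ways to retrieve that job id are sketched below: the `--parsable` flag, which makes sbatch print just the id (available in reasonably recent SLURM versions), or parsing sbatch's default "Submitted batch job &lt;id&gt;" message. The `msg` variable here is a stand-in for real sbatch output, since submitting requires a cluster.

```shell
# Option A: --parsable makes sbatch print only the job id
#   jobid=$(sbatch --parsable mpijob.slurm)

# Option B: parse the default output line "Submitted batch job <id>"
#   msg=$(sbatch mpijob.slurm)
msg="Submitted batch job 12345"        # stand-in for real sbatch output
jobid=$(echo "$msg" | awk '{print $4}')
echo "$jobid"

# then submit the dependent post-processing job:
#   sbatch -d afterok:$jobid postprocessing.slurm
```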