I am launching an MPI program through an sbatch script like the following (based on an example script provided by the system administrators):
#! /bin/bash -l
#SBATCH --job-name=test
#SBATCH -o stdout.log
#SBATCH -e stderr.log
#SBATCH --ntasks=160
#SBATCH --time=0-00:10:00
#SBATCH --qos=normal
cd $WORK/
mpirun ./mpiprogram
However, it seems that sometimes more MPI processes are launched than --ntasks requests: in one case 200 processes when 160 were requested, and in other cases different counts. Note that the nodes have either 16 or 20 cores. Some of the worker processes (which are all essentially identical in my case) run much more slowly than the others, perhaps because of swapping; the swapping may be caused by too many processes landing on one node and using too much memory.
Should I explicitly pass the number of processes to mpirun using $SLURM_NTASKS? Or what is going on here?
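For reference, the variant I am considering would pass the task count explicitly on the launch line. This is only a sketch; I am not sure whether the -np flag is actually needed here or whether mpirun is supposed to pick up the allocation from Slurm automatically:

```shell
#! /bin/bash -l
#SBATCH --job-name=test
#SBATCH -o stdout.log
#SBATCH -e stderr.log
#SBATCH --ntasks=160
#SBATCH --time=0-00:10:00
#SBATCH --qos=normal
cd $WORK/
# SLURM_NTASKS is set by Slurm to the value of --ntasks (160 here),
# so mpirun should start exactly that many processes.
mpirun -np $SLURM_NTASKS ./mpiprogram
```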