Multiple single core srun on a node


I have a csh script that looks like this

foreach n (`seq 1 1000000`)
  ./myprog${n}.x
end

I want to parallelize it and run it on my Slurm cluster. Because each instance of the program requires only 1 core, I want to use a node (or a few nodes) to run many of them at a time.

#!/bin/csh 
#SBATCH --nodes=8
#SBATCH -n 1024
#SBATCH --ntasks-per-node=128
foreach n (`seq 1 1000000`)
  srun -N 1 -n 1 ./myprog${n}.x &
end
wait

When I do this, it seems like only one task runs at a time on a given node, although it's difficult to tell. Is there an option I can add to srun, or an #SBATCH header I can add, that will let the job use all of the cores I've requested?
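(For what it's worth, the closest I've come to checking this is looking at the job's steps, since each srun inside the batch script becomes its own job step; 12345 is a placeholder job ID here.)

# Each backgrounded srun becomes a job step; if they really run concurrently,
# their Start times should overlap rather than follow one another.
sacct -j 12345 --format=JobID,JobName,Start,End,Elapsed,NodeList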

Answer from AndyT:

How you do this can vary with the version of Slurm that is running. However, one example is given at:

https://docs.archer2.ac.uk/user-guide/scheduler/#example-4-256-serial-tasks-running-across-two-nodes

Note: this assumes you have exclusive node access. Essentially, you loop over the nodes assigned to the job and then loop over the tasks you want to place on each of them. Here is an example job submission script based on yours (note: you will need to change the --mem option to a value suitable for the total amount of memory available on the compute nodes you are using).

#!/bin/bash
#SBATCH --job-name=MultiSerialOnComputes
#SBATCH --time=0:10:0
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1


# Get a list of the nodes assigned to this job in a format we can use.
#   scontrol converts the condensed node IDs in the sbatch environment
#   variable into a list of full node IDs that we can use with srun to
#   ensure the subjobs are placed on the correct node. e.g. this converts
#   "nid[001234,002345]" to "nid001234 nid002345"
nodelist=$(scontrol show hostnames $SLURM_JOB_NODELIST)

# Loop over the nodes assigned to the job
for nodeid in $nodelist
do
    # Loop over 128 subjobs on each node, pinning each to a different core
    for n in $(seq 1 128)
    do
        # Launch subjob overriding job settings as required and in the background
        # Make sure to change the amount specified by the `--mem=` flag to the amount
        # of memory required. The amount of memory is given in MiB by default but other
        # units can be specified.
        srun --nodelist=${nodeid} --nodes=1 --ntasks=1 --ntasks-per-node=1 \
        --exact --mem=1500M ./myprog${n}.x &
    done
done

# Wait for all subjobs to finish
wait

This does not manage the 1,000,000 tasks you originally specified, but you should be able to come up with some arithmetic so that your total number of tasks is split across the nodes you are assigned (or you can set up a set of jobs that ends up with exactly the right number of tasks per node).
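For completeness, one rough way to do that arithmetic is to launch the subjobs in "waves" of nodes x 128 tasks, assigning each task to a node round-robin and waiting between waves. This is only a sketch (the --mem value and program names are placeholders, as above), and it has the drawback that each wave waits for its slowest task before the next wave starts:

#!/bin/bash
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1

TOTAL=1000000            # total number of serial tasks
PER_NODE=128             # concurrent tasks per node

# Array of full node names assigned to this job
nodelist=( $(scontrol show hostnames $SLURM_JOB_NODELIST) )
nnodes=${#nodelist[@]}
width=$(( nnodes * PER_NODE ))   # tasks running at any one time (8 * 128 = 1024)

# Work through the tasks in waves of $width, assigning each task in a wave
# to a node in round-robin fashion, then waiting before the next wave.
n=1
while [ $n -le $TOTAL ]
do
    for (( i=0; i<width && n<=TOTAL; i++, n++ ))
    do
        nodeid=${nodelist[$(( i / PER_NODE ))]}
        srun --nodelist=${nodeid} --nodes=1 --ntasks=1 --ntasks-per-node=1 \
             --exact --mem=1500M ./myprog${n}.x &
    done
    # Wait for this wave of subjobs before launching the next
    wait
done

A more even spread (so a slow task does not hold up a whole wave) would need something like GNU parallel or a per-node worker loop, but the wave approach keeps the structure of the script above.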