How to use sbatch (SLURM) inside Docker on an AWS EC2 instance?


I am trying to get OpenFOAM to run on an AWS EC2 cluster using AWS parallelCluster.

One possibility is to compile OpenFOAM on the cluster. Another is to use a Docker container. I am trying to get the second option to work.

However, I am running into trouble understanding how I should orchestrate the various operations. Basically, what I need is:

  1. copy an OpenFOAM case from S3 to the FSx file system on the master node
  2. run the Docker container containing OpenFOAM
  3. perform OpenFOAM operations, some of them using the cluster (running the computation in parallel being the most important one)
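The steps above could be driven from the head node by a plain shell script. A minimal sketch, assuming a hypothetical bucket name, FSx mount point, case name, and job-script filename:

```shell
#!/bin/bash
# Run on the cluster's head node. Bucket, mount point, case name,
# and job-script name are hypothetical placeholders -- adjust them.
S3_BUCKET="s3://my-openfoam-cases"   # assumption: your S3 bucket
FSX_DIR="/fsx/cases"                 # assumption: FSx mount point
CASE="motorBike"                     # assumption: case name

# 1. copy the OpenFOAM case from S3 to the FSx file system
aws s3 sync "${S3_BUCKET}/${CASE}" "${FSX_DIR}/${CASE}"

# 2./3. submit the OpenFOAM run to SLURM from the host shell,
#       where sbatch is available (not from inside the container)
sbatch run-openfoam.sbatch "${FSX_DIR}/${CASE}"
```

The key point of this layout is that sbatch always runs on the host, so SLURM never has to be visible inside the container.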

I want to put all of this into scripts to make it reproducible, but I am wondering how I should structure the scripts so that SLURM handles the parallel side of things.

My problem at the moment is that the master node's shell knows commands such as sbatch, but when I launch Docker to access the OpenFOAM commands, the container "forgets" sbatch.

How could I easily expose all SLURM-related commands (sbatch, ...) inside Docker? Is this the correct way to handle the problem?
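One common workaround is to invert the nesting: keep sbatch on the host, and wrap only the OpenFOAM command in the container inside the job script. A hedged sketch of such a job script (the image name and case path are assumptions, not from my setup):

```shell
#!/bin/bash
#SBATCH --job-name=openfoam
#SBATCH --nodes=1
#SBATCH --ntasks=16

CASE_DIR="$1"   # e.g. /fsx/cases/motorBike (hypothetical path)

# The host shell runs sbatch; only the solver runs inside the
# container, with the case bind-mounted so results land on FSx.
docker run --rm \
    -v "${CASE_DIR}:/case" -w /case \
    my-openfoam-image \
    bash -c "mpirun -np ${SLURM_NTASKS} simpleFoam -parallel"
```

With this structure the container never needs to know about SLURM at all; it only needs the bind-mounted case directory. Note this sketch is single-node only, since the MPI ranks live inside one docker run invocation.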

Thanks for the support

1 Answer


For the first option, there is a workshop that walks you through it: cfd-on-pcluster.

For the second option, I created a container workshop that uses HPC container runtimes: containers-on-pcluster.

I incorporated a section about GROMACS but I am happy to add OpenFOAM as well. I am using Spack to create the container images. While I only documented single-node runs, we can certainly add multi-node runs.

Running Docker via sbatch is not going to get you very far, because Docker is not a user-land runtime: it relies on a privileged root daemon, which does not integrate well with HPC schedulers. For more info: FOSDEM21 Talk about Containers in HPC
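With an HPC container runtime such as Singularity/Apptainer, the same pattern integrates cleanly with SLURM, because the runtime runs in user space and under MPI launchers. A minimal multi-node sketch, assuming a hypothetical image file and case path (not taken from the workshop):

```shell
#!/bin/bash
#SBATCH --job-name=openfoam
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16

CASE_DIR="/fsx/cases/motorBike"   # hypothetical case path on FSx

# srun launches one container process per MPI rank; Singularity
# inherits the MPI environment, so the decomposed solver runs
# across both nodes with the case bind-mounted from FSx.
srun singularity exec --bind "${CASE_DIR}:/case" \
    openfoam.sif \
    bash -c "cd /case && simpleFoam -parallel"
```

This only works if the case has already been decomposed (decomposePose/decomposePar-style pre-processing) to match the total rank count.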

Cheers
Christian (full disclosure: AWS Developer Advocate HPC/Batch)