I am trying to submit this job to a SLURM cluster via a .sh script, using COMSOL:
#!/bin/bash
#SBATCH --job-name=my_work
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --mem=20G
#SBATCH --partition=my_partition
#SBATCH --time=4-0
#SBATCH --no-requeue
#SBATCH --exclusive
#SBATCH -D $HOME
#SBATCH --output=Lecho1_%j.out
#SBATCH --error=Lecho1_%j.err
cd /home/myuser/myfile/
module load intel/2019b
module load OpenMPI/4.1.1
module load COMSOL/5.5.0
comsol batch -mpibootstrap slurm -nn 20 -nnhost 20 -inputfile myfile.mph -outputfile myfile.outout.mph -study std1 -batchlog myfile.mph.log
When I do so, I get the following error message:
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(805): fail failed
MPID_Init(1743)......: channel initialization failed
MPID_Init(2137)......: PMI_Init returned -1
Can anyone tell me what it means and how to fix it?
The way you call COMSOL is incorrect. The submission script should contain the following lines to run COMSOL on a cluster with SLURM:
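For orientation, here is a minimal sketch of how such a script could be laid out for a single-node run. It is not a definitive fix, and it rests on some assumptions: that COMSOL ships its own MPI (so no separate intel or OpenMPI module is loaded), and that one COMSOL process using 20 cores (-nn 1 -nnhost 1 -np 20) is what you want, rather than 20 MPI processes on one host. The comsol_layout helper and the dry-run fallback are hypothetical additions for illustration. File, module, and partition names are taken from the question.

```shell
#!/bin/bash
#SBATCH --job-name=my_work
#SBATCH --nodes=1
# One COMSOL (MPI) process per node, with 20 cores given to that process.
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=20G
#SBATCH --partition=my_partition
#SBATCH --time=4-0
#SBATCH --output=Lecho1_%j.out
#SBATCH --error=Lecho1_%j.err

# Assumption: COMSOL brings its own MPI, so only its module is loaded.
if command -v module >/dev/null 2>&1; then
    module purge
    module load COMSOL/5.5.0
fi

# Hypothetical helper: derive COMSOL's process layout from the SLURM
# allocation. The fallbacks (1 node, 1 process per node, 20 cores) only
# apply when the variables are unset, e.g. outside a job.
comsol_layout() {
    local nodes="${SLURM_JOB_NUM_NODES:-1}"
    local per_host="${SLURM_NTASKS_PER_NODE:-1}"
    local cores="${SLURM_CPUS_PER_TASK:-20}"
    echo "-nn $((nodes * per_host)) -nnhost ${per_host} -np ${cores}"
}

# Split the layout string into separate arguments and build the command.
read -r -a layout <<< "$(comsol_layout)"
cmd=(comsol batch -mpibootstrap slurm "${layout[@]}"
     -inputfile myfile.mph -outputfile myfile.outout.mph
     -study std1 -batchlog myfile.mph.log)

# Run COMSOL if it is on PATH; otherwise just print the command (dry run).
if command -v comsol >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "dry run: ${cmd[*]}"
fi
```

Submit it with sbatch from the directory that holds myfile.mph. Running the script directly on a machine without COMSOL only prints the command it would launch, which is a quick way to check the -nn/-nnhost/-np values before queuing the job.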