Slurm sbatch job fails

I am writing a script, test.job, to submit a job using sbatch. The script is shown below.

#!/bin/bash
#SBATCH -J test
#SBATCH --time=00:01:00
#SBATCH -N 2
#SBATCH -n 2
#SBATCH -o logs/%j.sleep
#SBATCH -e logs/%j.sleep

echo test

Then I ran it with sbatch test.job, and the job failed with the following status:

JobId=8672 JobName=test
   UserId=xxx(2379) GroupId=users(100) MCS_label=N/A
   Priority=4294893104 Nice=0 Account=(null) QOS=(null)
   JobState=FAILED Reason=NonZeroExitCode Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=1:0
   RunTime=00:00:00 TimeLimit=00:01:00 TimeMin=N/A
   SubmitTime=2021-05-03T03:04:21 EligibleTime=2021-05-03T03:04:21
   AccrueTime=2021-05-03T03:04:21
   StartTime=2021-05-03T03:04:22 EndTime=2021-05-03T03:04:22 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-05-03T03:04:22
   NumNodes=2 NumCPUs=2 NumTasks=2 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=2,mem=4G,node=2,billing=2
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=2G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)

Any idea what I did wrong?

There is 1 best solution below


It is easier to debug such problems by running the script interactively with srun, so that any error is printed straight to your terminal:

srun test.job

Then you will probably see the error and be able to fix it. Typical causes are wrong permissions on the logs folder, or test.job not being set as executable.
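The two suspects above can be checked before submitting. The following is only a rough sketch under assumptions: it takes the script name test.job and the logs directory from the question, and the helper name preflight is made up for illustration. It also covers a related case that may apply here: as far as I know, Slurm does not create the directory named in -o/-e, so a missing logs folder can make a job fail immediately with a non-zero exit code.

```shell
#!/bin/bash
# Pre-flight checks before submitting; a sketch, not a definitive fix.
# Assumption: the job script is ./test.job and writes to ./logs (from the question).
# "preflight" is a hypothetical helper name, not part of Slurm.

preflight() {
    local script="$1" logdir="$2"

    # The directory used in "#SBATCH -o/-e" must exist before the job starts.
    mkdir -p "$logdir" || return 1

    # It must also be writable by the submitting user.
    [ -w "$logdir" ] || { echo "cannot write to $logdir" >&2; return 1; }

    # srun executes the file directly, so it needs the executable bit.
    if [ -f "$script" ]; then
        chmod u+x "$script" || return 1
    else
        echo "job script $script not found" >&2
        return 1
    fi
}

# Submit only when the checks pass and sbatch is available on this machine.
if preflight ./test.job logs && command -v sbatch >/dev/null 2>&1; then
    sbatch test.job
fi
```

If the job still fails after this, running it once through srun as suggested above should show the actual error message.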