Bash module load function not working as expected

462 Views Asked by At

I have a batch script that I eventually want to execute on a cluster via condor_submit. The script needs to load some module via "module load matlab/R2020a". However nothing works.

The script looks like this:

#!/bin/bash
module load cudnn/8.2.0-cu11.x
module load cuda/11.2
module load matlab/R2020a
echo $PATH
echo $SHELL
#Check matlab version 
echo_and_run() { echo "$*" ; "$@" ; }
matlab -e | grep "MATLAB="
echo_and_run matlab -e | grep "MATLAB="  
...
setup input etc. 
...
echo_and_run matlab -nodisplay -batch "......matlab commands"

When I run it from my home shell it gives me:

...
/bin/bash
MATLAB=/is/software/matlab/linux/R2014a
MATLAB=/is/software/matlab/linux/R2014a
...

Which is both not correct. When executing this in my local shell ( source ./scriptname.sh) The output is even more confusing:

...
/bin/bash
MATLAB=/is/software/matlab/linux/R2020a
MATLAB=/is/software/matlab/linux/R2014a
...

So the matlab version updates, but only for the non "echo_and_run" execution (the first call). In the actual call it also is the default 2014 version.

What in the world is going on? I checked the $PATH variables and they are identically to my running shell. I tried sourcing ~/.bashrc at the top of the script, no difference. When I type "type module" I can see that it is a function:

module is a function
module () 
{ 
    _module_raw "$@" 2>&1
}

Some older posts mention, that I should either run with "source" or "." (for sh) but I cannot do that, since the script is called by condor_submit eventually. Or I should find the file that defines module. However I do not know what other file (besides ~/.bashrc) that could be.

Old post: https://unix.stackexchange.com/questions/194893/why-cant-i-load-modules-while-executing-my-bash-script-but-only-when-sourcing

Edit

I am currently trying everything locally (in the login shell) which could be different from the execution shell, but even here I get this weird behaviour.

Edit II:

    + type -a _module_raw
_module_raw is a function
_module_raw () 
{ 
    unset _mlshdbg;
    if [ "${MODULES_SILENT_SHELL_DEBUG:-0}" = '1' ]; then
        case "$-" in 
            *v*x*)
                set +vx;
                _mlshdbg='vx'
            ;;
            *v*)
                set +v;
                _mlshdbg='v'
            ;;
            *x*)
                set +x;
                _mlshdbg='x'
            ;;
            *)
                _mlshdbg=''
            ;;
        esac;
    fi;
    unset _mlre _mlIFS;
    if [ -n "${IFS+x}" ]; then
        _mlIFS=$IFS;
    fi;
    IFS=' ';
    for _mlv in ${MODULES_RUN_QUARANTINE:-};
    do
        if [ "${_mlv}" = "${_mlv##*[!A-Za-z0-9_]}" -a "${_mlv}" = "${_mlv#[0-9]}" ]; then
            if [ -n "`eval 'echo ${'$_mlv'+x}'`" ]; then
                _mlre="${_mlre:-}${_mlv}_modquar='`eval 'echo ${'$_mlv'}'`' ";
            fi;
            _mlrv="MODULES_RUNENV_${_mlv}";
            _mlre="${_mlre:-}${_mlv}='`eval 'echo ${'$_mlrv':-}'`' ";
        fi;
    done;
    if [ -n "${_mlre:-}" ]; then
        eval `eval ${_mlre}/usr/bin/tclsh8.6 /usr/lib/x86_64-linux-gnu/modulecmd.tcl bash '"$@"'`;
    else
        eval `/usr/bin/tclsh8.6 /usr/lib/x86_64-linux-gnu/modulecmd.tcl bash "$@"`;
    fi;
    _mlstatus=$?;
    if [ -n "${_mlIFS+x}" ]; then
        IFS=$_mlIFS;
    else
        unset IFS;
    fi;
    unset _mlre _mlv _mlrv _mlIFS;
    if [ -n "${_mlshdbg:-}" ]; then
        set -$_mlshdbg;
    fi;
    unset _mlshdbg;
    return $_mlstatus
}
1

There are 1 best solutions below

0
Xavier Delaruelle On

Provided script output does not help to understand if there is an issue at the environment modules level.

Adding a module list command in your script after the 3 module load commands may help to determine if the module function has properly loaded your environment or not.

In some situations, like when running script on a cluster through a batch scheduler, it is good to source the module initialization script at the start of such script to ensure the module function is defined. It seems that you are running on a Debian-like system, so the initialization script may be sourced with:

. /usr/share/modules/init/bash