Getopt sees extra '--' argument that I haven't included in the command


I'm trying to write some code that will tie together a series of conda tools and a bit of my own python code. I've provided some getopt options, but they parse weirdly. I'd like options to be able to be provided in any order, and I'd like both short and long options available to use. I've provided the code where I define default options as well as the main getopt section. Here's the relevant code snippet:

seq_tech=''
input=''
outfolder='ProkaRegia'
threads=$(grep ^cpu\\scores /proc/cpuinfo | uniq |  awk '{print $4}')

is_positive_integer() {
    re='^[0-9]+$'
    if ! [[ $1 =~ $re ]] ; then
       return 1
    fi
    if [ "$1" -le 0 ]; then
        return 1
    fi
    return 0
}
...

ARGS=$(getopt -o i:o::t::s:c::h:: -l 'input:,output::,threads::,seq_tech:,clean::,help::' -n 'prokaregia.sh' -- "$@")
eval set -- "$ARGS"

echo "All arguments: $@"

if [[ $# -eq 1 ]]; then
    usage
fi

while [[ $# -gt 0 ]]; do
    case $1 in
        -i|--input)
            input="$(readlink -f "$2")"
            echo "$2"
            shift 2
            ;;
        -o|--output)
            output=$2
            shift 2
            ;;
        -t|--threads)
            if is_positive_integer "$2"; then
                threads=$2
                shift 2
            else
                echo "Error: Thread count must be a positive integer."
                exit 1
            fi
            ;;
        -s|--seq_tech)
            if [[ $2 == "ont" || $2 == "pacbio" ]]; then
                seq_tech=$2
                shift 2
            else
                echo "Error: Sequencing technology must be either 'ont' or 'pacbio'."
                exit 1
            fi
            ;;
        -c|--clean)
            clean_option=true
            shift
            ;;
        -h|--help)
            usage
            ;;
        *)
            echo "Error: Invalid option $1"
            exit 1
            ;;
    esac
done

However, on running the following command:

bash prokaregia.sh -t 2 -i prokaregia.dockerfile

I get the following returned:

All arguments: -t  -i prokaregia.dockerfile -- 2
Error: Thread count must be a positive integer.

is_positive_integer works perfectly from the command line (thanks chatgpt!), and switching to the long options "--input" and "--threads" gives the same behavior, as does changing the order of the options. I'm fairly certain the issue comes from whatever is generating the extra double hyphen in the list of arguments. It also seems to generate an extra blank, since echo "$2" in the threads branch prints an empty string. Other options misbehave in similar ways because of these same characters, and I'd be happy to go into that if people think it would be helpful.


There are 2 best solutions below

Marco Bonelli On BEST ANSWER

You seem to be misinterpreting the two colons after an option name. The syntax t:: does not mean "option -t is optional", it means "option -t takes an optional argument". getopt treats every option as optional, so you have to check yourself whether the mandatory ones were supplied.
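A minimal sketch of such a check, run after the parsing loop has finished. The helper name require_option is made up for this illustration and is not part of the original script; it only assumes that an option which was never supplied leaves its variable empty, as input='' does in the question.

```shell
# Hypothetical helper (not in the original script): fail when a
# mandatory option ended up with an empty value after parsing.
require_option() {
    # $1 = option name used in the message, $2 = value that was parsed
    if [ -z "$2" ]; then
        echo "Error: $1 is required." >&2
        return 1
    fi
}

# Typical use after the while loop:
#   require_option --input "$input" || exit 1
#   require_option --seq_tech "$seq_tech" || exit 1
```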

As for why the argument does not get parsed: this is how getopt handles options with optional arguments (two colons :: after the option name). It only recognizes the argument if it is attached directly to the option, with no whitespace between the two. So -t2 or --threads=2 works, while -t 2 or --threads 2 does not. The likely reason is that with an optional argument getopt cannot tell whether you mean an option without an argument followed by a positional argument, or an option with an argument and no positional argument. That is exactly what your output shows: -t received an empty argument and the 2 was moved after the -- as a positional argument.
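The difference can be checked with a small sketch. It assumes the enhanced getopt from util-linux (the one usually found on Linux); the function name threads_value and the variable names are made up for the demonstration.

```shell
# Print the value getopt hands back for -t/--threads when the option
# is declared with an optional argument (t:: / threads::).
# Assumption: getopt is the enhanced util-linux implementation.
threads_value() {
    parsed=$(getopt -o t:: -l 'threads::' -n demo -- "$@") || return 1
    eval set -- "$parsed"
    # $1 is now -t or --threads, $2 is its (possibly empty) argument
    printf '%s\n' "$2"
}

detached=$(threads_value -t 2)          # argument separated by a space
attached=$(threads_value -t2)           # argument attached to the short option
long_form=$(threads_value --threads=2)  # argument attached with '='

echo "detached='$detached' attached='$attached' long_form='$long_form'"
```

Only the detached form should come back empty, which matches the blank the question observed when echoing "$2".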

For options with a mandatory argument (a single colon), however, this does not happen: you can write any of -t2, -t 2, --threads 2 or --threads=2 and get the same result.

I don't think you want optional option arguments anyway, so just remove all the additional colons you added and everything should work as you wish:

ARGS=$(getopt -o i:o:t:s:ch -l 'input:,output:,threads:,seq_tech:,clean,help' -n 'prokaregia.sh' -- "$@")

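In any case, getopt will always print the -- to let you know that there are no more options and that from that point onwards you only have positional arguments. Your while loop has no case for it, so add a --) branch that shifts it away and breaks out of the loop; otherwise it falls through to *) and is reported as an invalid option.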
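Put together, the parsing loop could look roughly like the sketch below. It reuses the option names from the question and assumes the enhanced util-linux getopt. Validation (is_positive_integer, the ont/pacbio check) and the usage function are left out to keep it short, and show_help and remaining_count are placeholder variables introduced only for this example.

```shell
# Sketch of the parsing loop with single-colon options and a --) branch.
parse_args() {
    parsed=$(getopt -o i:o:t:s:ch \
        -l 'input:,output:,threads:,seq_tech:,clean,help' \
        -n 'prokaregia.sh' -- "$@") || return 1
    eval set -- "$parsed"

    while [ $# -gt 0 ]; do
        case $1 in
            -i|--input)    input=$2;          shift 2 ;;
            -o|--output)   outfolder=$2;      shift 2 ;;
            -t|--threads)  threads=$2;        shift 2 ;;
            -s|--seq_tech) seq_tech=$2;       shift 2 ;;
            -c|--clean)    clean_option=true; shift ;;
            -h|--help)     show_help=true;    shift ;;  # placeholder for calling usage
            --)            shift; break ;;              # end of options
            *)             echo "Error: Invalid option $1" >&2; return 1 ;;
        esac
    done

    # Whatever is left in "$@" here are positional arguments.
    remaining_count=$#
}

parse_args -t 2 -i prokaregia.dockerfile
echo "threads=$threads input=$input remaining=$remaining_count"
```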

Ted Lyngmo On

The argument -- is a common way for programs to separate options (and their arguments) from the rest of the arguments. getopt stops parsing options when it sees --. This makes it possible to pass something like -o -- -p, where -p is not parsed as an option but is left to the program as a regular argument.

What getopt shows you after -- is therefore every argument that was not parsed as an option (or as an argument to an option). It rearranges the command line so that all options come first, which lets you loop over them until you reach --.
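Both effects can be seen in a short sketch, again assuming the enhanced util-linux getopt. The function name leftovers and the option letters -o and -p are arbitrary choices for the demonstration.

```shell
# Print everything getopt places after the "--" separator.
# Assumption: getopt is the enhanced util-linux implementation.
leftovers() {
    parsed=$(getopt -o op -n demo -- "$@") || return 1
    eval set -- "$parsed"
    # Skip the parsed options; getopt always emits "--", so this ends.
    while [ "$1" != "--" ]; do
        shift
    done
    shift   # drop the "--" itself
    printf '%s\n' "$*"
}

# -p comes after an explicit --, so it is kept as a regular argument.
after_separator=$(leftovers -o -- -p)

# file.txt comes before -o on the command line, but getopt moves it
# behind the "--" it generates.
reordered=$(leftovers file.txt -o)

echo "after_separator='$after_separator' reordered='$reordered'"
```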