Run Jobs Sequentially, in the Background, in Bash

I'm writing a bash script to execute a python program with different values of parameters 'H' and 'K'. What I would like to happen is that only one instance of 'program.py' runs at a time and subsequent jobs are not started until the previous one finishes.

Here's what I tried -

#!/bin/bash

H="1 10 100"
K="0.01 0.1"

for H_val in ${H}
do
    for K_val in ${K}
    do
        { nohup python program.py $H_val $K_val; } &
    done
done

However, this launches all the jobs at once rather than waiting for each one to finish. Conversely, if I remove the ampersand, the jobs do run one at a time - but not in the background. Any ideas on how to proceed?

Charles Duffy (Best Answer)

Put the & after the command you want to run in the background. In this context, that can be your whole loop:

#!/usr/bin/env bash
#              ^^^^- ensures that array support is available

hValues=( 1 10 100 ) # best practice is to iterate over array elements, not...
kValues=( 0.01 0.1 ) # ...words in strings!

{
  # perform conditional redirections akin to what nohup would do
  [ -t 0 ] && exec </dev/null      # stdin
  [ -t 1 ] && exec >myprogram.log  # stdout
  [ -t 2 ] && exec 2>&1            # stderr

  for hVal in "${hValues[@]}"; do
    for kVal in "${kValues[@]}"; do
      python program.py "$hVal" "$kVal"
    done
  done
} &

Notice how the & was moved to after the done -- that way we background the entire loop rather than a single command within it, so the backgrounded process -- running that loop -- invokes only one copy of the Python interpreter at a time.

The redirections (</dev/null, >myprogram.log, and 2>&1) mimic what nohup does: if stdout or stderr is attached to a TTY, nohup redirects it to nohup.out so that neither stream is left attached to the terminal. Here the log goes to myprogram.log instead; adjust that name to your preference.
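
If you later need the script to block until the whole backgrounded batch finishes (say, before post-processing its results), you can record the background job's PID via $! and wait on it. A minimal sketch, with sleep standing in for python program.py:

#!/usr/bin/env bash

{
  for hVal in 1 10 100; do
    for kVal in 0.01 0.1; do
      sleep 1   # stand-in for: python program.py "$hVal" "$kVal"
    done
  done
} &
batch_pid=$!          # PID of the backgrounded compound command

# ...do other work here...

wait "$batch_pid"     # blocks until the entire batch has finished
echo "batch exit status: $?"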

hysteresis

You can use the task spooler (tsp): it maintains a queue of jobs and launches them one at a time, in order of submission (by default). When you submit a new job, it prints an assigned ID that you can use to add constraints, for example running a job only if another one exited successfully. It also stores each job's output stream separately.
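
For instance, a rough sketch of that ID mechanism (tsp prints the assigned ID on stdout when you enqueue; the dependency flag is -D in the task spooler versions I've used, but check your man page, and postprocess.py here is a hypothetical follow-up job):

# enqueue a job and capture the ID that tsp prints
id=$(tsp python program.py 1 0.01)

# enqueue a follow-up that runs only after job $id ends
# (flag name and exact semantics vary by task spooler version)
tsp -D "$id" python postprocess.py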

Your script would then be:

#!/bin/bash

H=(1 10 100)
K=(0.01 0.1)

for H_val in "${H[@]}"
do
    for K_val in "${K[@]}"
    do
        tsp python program.py "${H_val}" "${K_val}"
    done
done

Or, more concisely, since you want the full parameter sweep, you can use GNU parallel to generate all the combinations and avoid the explicit nested loops:

#!/bin/bash

H=(1 10 100)
K=(0.01 0.1)

# fill the queue with all parameter combinations
# (tsp prints a job ID for each submission; discard them)
parallel tsp python program.py ::: "${H[@]}" ::: "${K[@]}" >/dev/null

After running your script, you can monitor progress by calling tsp without arguments, and view a job's output with tsp -c <jobID>.
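
If installing the task spooler is not an option, GNU parallel can also serialize the jobs by itself: -j1 limits it to one job at a time, and backgrounding the whole invocation keeps your terminal free. A minimal sketch, assuming the same arrays as above and a log file name (sweep.log) of my own choosing:

#!/bin/bash

H=(1 10 100)
K=(0.01 0.1)

# -j1 runs one parameter combination at a time; nohup + & detach the sweep
nohup parallel -j1 python program.py ::: "${H[@]}" ::: "${K[@]}" >sweep.log 2>&1 &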