Two variables in one for loop in bash


I have a directory that contains two file types, *.sai and *.fastq, and I want to use both in a single shell for loop:

for j in *sai *fastq
do
    bwa samse $j $j > ${j%.sai}.sam
done

After "do" I want to pass the corresponding *.sai and *.fastq files to the program (bwa samse). Could you please help me with the syntax?

EXAMPLE:

One directory contains xx.fastq, xx.sai, yy.fastq and yy.sai, and bwa samse needs to process the two corresponding files in a single call: bwa samse xx.fastq xx.sai...

Many thanks for any ideas.

There are 4 best solutions below

2
On BEST ANSWER

Try doing this with bash parameter expansion:

for j in *.sai; do
    [[ -s ${j%.sai}.fastq ]] &&
        bwa samse "$j" "${j%.sai}.fastq" > "${j%.sai}.sam"
done

And please, stop killing kitties by parsing ls output. (Not aimed at you, Incorigible.)
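A minimal dry-run sketch of the pairing idea, assuming every xx.sai has its partner named xx.fastq in the same directory. The temporary directory, the file names and the pair_commands helper are made up for illustration, and the bwa command line is only printed, never executed.

```shell
# Dry-run sketch: pair each .sai file with its .fastq partner.
# All file names below are hypothetical test data.
workdir=$(mktemp -d)
touch "$workdir/xx.sai" "$workdir/yy.sai" "$workdir/zz.sai"
echo reads > "$workdir/xx.fastq"
echo reads > "$workdir/yy.fastq"      # zz.fastq is deliberately missing

pair_commands() {
    for sai in "$1"/*.sai; do
        [ -e "$sai" ] || continue          # the glob matched nothing
        base=${sai%.sai}                   # strip the .sai suffix
        name=${base##*/}                   # drop the directory part for display
        if [ -s "$base.fastq" ]; then      # partner exists and is not empty
            echo "bwa samse $name.sai $name.fastq > $name.sam"
        else
            echo "skip $name.sai: no matching .fastq" >&2
        fi
    done
}

output=$(pair_commands "$workdir")
printf '%s\n' "$output"
rm -rf "$workdir"
```

With the files above, this prints one command line each for xx and yy, and reports zz.sai as skipped on standard error.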

0
On

Using GNU Parallel it looks like this:

parallel bwa samse ref.fasta {} {.}.fastq '>' {.}.sam  ::: *.sai   

GNU Parallel is a general parallelizer that makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to. It can often replace a for loop.
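In the command above, {} stands for one input argument and {.} for the same argument with its extension removed. The sketch below is plain shell, not GNU Parallel itself: it only prints the command line the template would produce for each input. The expand_template helper and the names xx.sai and yy.sai are made up for illustration.

```shell
# Plain-shell illustration of what {} and {.} stand for in the template.
# Nothing is executed and nothing runs in parallel here.
expand_template() {
    for arg in "$@"; do
        noext=${arg%.*}      # what {.} becomes: the argument minus its extension
        echo "bwa samse ref.fasta $arg $noext.fastq > $noext.sam"
    done
}

expanded=$(expand_template xx.sai yy.sai)
printf '%s\n' "$expanded"
```

GNU Parallel can print its own expansion in a similar way if you add the --dry-run option to the parallel command.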

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: Simple scheduling]

GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

0
On

Try not to use ls to feed the loop. Use brace expansion to only include *.sai and *.fastq files in your loop:

for j in ./*.{sai,fastq}
do
    ## do what you need to the *.sai & *.fastq files 
done

You can also provide a path variable:

mypath=/path/to/files
for j in "${mypath}"/*.{sai,fastq}
(snip)
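A small sketch of what the brace pattern expands to, assuming bash (brace expansion is not available in plain sh). Braces are expanded before globbing, so the loop sees every .sai file first and every .fastq file afterwards; the pairs are not interleaved. The temporary directory and file names are made up for illustration.

```shell
#!/usr/bin/env bash
# ./*.{sai,fastq} is rewritten to ./*.sai ./*.fastq, then each pattern is globbed.
workdir=$(mktemp -d)
touch "$workdir"/xx.sai "$workdir"/xx.fastq "$workdir"/yy.sai "$workdir"/yy.fastq "$workdir"/notes.txt

shopt -s nullglob                       # a pattern with no match expands to nothing
files=("$workdir"/*.{sai,fastq})
shopt -u nullglob

names=("${files[@]##*/}")               # strip the directory from every element
echo "${names[*]}"                      # xx.sai yy.sai xx.fastq yy.fastq
rm -rf "$workdir"
```

Because of this ordering, a single loop variable over the combined list cannot hand both files of a pair to one command, which is why looping over one extension and deriving the other name works better for paired inputs.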

NOTE: I have no clue what bwa samse $j $j > ${j%\.*}.sam does. Explain how you need to process the files and I can help further.

If there is a 1-to-1 relationship (matching .sai and .fastq files), then just:

for j in ./*.sai
do
    fname="${j%.*}"   # remove the extension ($fname is filename w/o ext)
    ## do what you need to the *.sai & *.fastq files 
    #  bwa samse "${fname}.sai" "${fname}.fastq" whatever else
done
3
On

(Edited to reflect the comments: using ls to list filenames isn't necessary.)

To strip the file extension, use ${j%\.*}, which keeps all characters before the last dot.

for j in *.sai *.fastq
do
    bwa samse "$j" "$j" > "${j%\.*}.sam"
done
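A short sketch of how the suffix-removal forms differ, using a made-up file name with two dots. Note that the dot is not special in shell patterns, so ${j%.*} behaves the same as ${j%\.*}.

```shell
# Shortest-match (%) versus longest-match (%%) suffix removal.
# sample.trimmed.sai is a hypothetical file name.
j=sample.trimmed.sai
short=${j%.*}      # removes the shortest suffix matching ".*"
long=${j%%.*}      # removes the longest suffix matching ".*"
exact=${j%.sai}    # removes only a literal ".sai" suffix

echo "$short"      # sample.trimmed
echo "$long"       # sample
echo "$exact"      # sample.trimmed
```

Naming the extension explicitly, as in ${j%.sai}, is the safest choice when the suffix is known, since the name is left untouched if it does not end in .sai.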