Running a Python QIIME script on multiple files


I am trying to write a script file that runs a Python script (from the QIIME pipeline) on multiple files, so that I don't have to type the command every time (I have roughly 150 files, with more coming).

I use VirtualBox to run an Ubuntu environment.

I started by creating a file "splitvm3.sh" using gedit.

This file contains:

#!/bin/sh
# this is the script for the VM3 experiment ~/splitvm3.sh

split_libraries.py -m mappingVM3001.txt -b 0 -p -f DNA12115-001-L1-R1-ACGCTCGACA.fna -q DNA12115-001-L1-R1-ACGCTCGACA.qual -o split_library_output001

split_libraries.py -m mappingVM3002.txt -b 0 -p -f DNA12115-002-L1-R1-AGACGCACTC.fna -q DNA12115-002-L1-R1-AGACGCACTC.qual -o split_library_output002

Then I used the command:

chmod +x ./splitvm3.sh

from the directory where my file is stored.

Finally, I ran the script by typing:

python splitvm3.sh

I get the error message:

SyntaxError: invalid syntax

Apparently it is pointing at line 4 of my file.

I totally lack the basic knowledge to understand what is going wrong. I started this whole Ubuntu/Python/QIIME thing 2 weeks ago and am learning everything by myself. Any help would be greatly appreciated!

Seb


There are 4 best solutions below

0
On

I know that this is an old question and the issue is probably sorted by now, but the error is coming from the multiple lines in your file.

QIIME scripts can deal with multiple files as long as they are in the correct format.

Try saving your "splitvm3.sh" file as:

split_libraries.py -m mappingVM3001.txt -b 0 -p -f DNA12115-001-L1-R1-ACGCTCGACA.fna,DNA12115-002-L1-R1-AGACGCACTC.fna -q DNA12115-001-L1-R1-ACGCTCGACA.qual,DNA12115-002-L1-R1-AGACGCACTC.qual -o split_library_output

Then run from the same directory where both .fna and .qual files are stored:

sh splitvm3.sh
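
With roughly 150 files, typing the comma-separated lists by hand gets tedious. A minimal sketch of building them in the shell follows. The helper names join_commas and qual_names are made up for illustration, and it keeps the single mapping file used above, which assumes that one mapping file covers all of the listed samples. It only prints the resulting command.

```shell
# Join all arguments with commas (hypothetical helper).
join_commas() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Print the .qual partner of each .fna name, one per line,
# so that both lists stay in the same order (hypothetical helper).
qual_names() {
    for f in "$@"; do
        printf '%s\n' "${f%.fna}.qual"
    done
}

fna_list=$(join_commas DNA12115-001-L1-R1-ACGCTCGACA.fna DNA12115-002-L1-R1-AGACGCACTC.fna)
# Unquoted on purpose: relies on word splitting, so file names must not contain spaces.
qual_list=$(join_commas $(qual_names DNA12115-001-L1-R1-ACGCTCGACA.fna DNA12115-002-L1-R1-AGACGCACTC.fna))

# Dry run: print the command instead of executing it.
printf '%s\n' "split_libraries.py -m mappingVM3001.txt -b 0 -p -f $fna_list -q $qual_list -o split_library_output"
```

Replacing the explicit file names with *.fna would pick up every .fna file in the current directory.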
0
On

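The problem is that you are trying to run a shell script with the Python interpreter. Although split_libraries.py is a Python script, the file you are trying to run is in fact a shell script. Python skips the first two lines as comments (they start with #) and then fails on line 4, the first split_libraries.py command, because that line is not valid Python.

You almost have it right; you just have to execute the script like this: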

sh splitvm3.sh

Or, given that you have a shebang and have already made the file executable, you could also just run:

./splitvm3.sh
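
To see that both forms do the same thing, here is a small sketch that writes a throwaway script and runs it both ways. The file name demo_split.sh is made up for illustration, and echo stands in for split_libraries.py so that it runs without QIIME installed.

```shell
# Create a throwaway shell script in the current directory.
demo=./demo_split.sh
printf '#!/bin/sh\necho "running as a shell script"\n' > "$demo"
chmod +x "$demo"

via_sh=$(sh "$demo")      # interpreter named explicitly
via_shebang=$("$demo")    # the system picks the interpreter from the #!/bin/sh line

echo "$via_sh"
echo "$via_shebang"
rm -f "$demo"
```

Running the same file with python would instead try to parse the echo line as Python code, which is what happened with splitvm3.sh.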
1
On

I do not know anything about 'split_libraries.py'.

It seems that this script writes the error message.

Copy this "line 4" and invoke it directly in your terminal. What happens? Where did you get the '...txt' file?

Is one of the input files in the wrong format or encoding?

0
On

This is an old question, but I've had lab members ask me about this, so I'd like to add that I've had good luck running QIIME scripts on multiple files with

find . -name "*.fastq" -exec qiimescriptname.py {} \;

Alternatively, I run for loops in bash, such as:

for file in data/*; do usearch32 -fastq_filter "${file}" -fastq_maxee 0.5 -fastq_truncqual 19 -fastq_qmax 45 -fastaout "${file}.fasta"; done;
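
Applied to the files in the question, a loop along these lines could be a starting point. It is only a sketch: it assumes every sample follows the naming scheme shown in the question (DNA12115-NNN-... with matching .fna and .qual files, mappingVM3NNN.txt, and split_library_outputNNN), and build_cmd is a hypothetical helper name. As written it only prints the commands so they can be checked first.

```shell
#!/bin/sh
# Print the split_libraries.py command for one .fna file (hypothetical helper).
build_cmd() {
    fna=$1
    base=${fna%.fna}                              # strip the .fna extension
    num=$(printf '%s\n' "$base" | cut -d- -f2)    # second dash-separated field, e.g. 001
    printf 'split_libraries.py -m mappingVM3%s.txt -b 0 -p -f %s -q %s.qual -o split_library_output%s\n' \
        "$num" "$fna" "$base" "$num"
}

# Loop over every matching .fna file in the current directory.
for fna in DNA12115-*.fna; do
    [ -e "$fna" ] || continue    # skip if the glob matched nothing
    build_cmd "$fna"             # dry run: print only
done
```

If the printed commands look right, the output could be piped to sh to actually run them. That is reasonable here only because the file names contain no spaces.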