BioPython, how to convert from .fasta to .aln for clustal alignment?

I have a .fasta file that I want to convert to .aln so that it can be read with the AlignIO.read command, or else somehow give my FASTA file "Clustal headers". When I pass the FASTA file directly, I just get an error saying it is not a known Clustal header. Is the return value of "ClustalwCommandline" supposed to do the conversion? The tutorial says to assign its return value to cline and then just print cline, and I'm not sure what to do with cline after that.

EDIT: I'm also supposed to output a .dnd file, and I'm not sure how to do that either.

Best answer:

You don't need to convert anything manually. ClustalW reads the FASTA file directly and writes the .aln file itself. Start by importing the wrapper:

>>> from Bio.Align.Applications import ClustalwCommandline

After importing ClustalwCommandline, you can specify the name of your alignment file. The line below only constructs the command and stores it in cline; it does not run anything yet:

>>> cline = ClustalwCommandline("clustalw", infile="opuntia1.fasta", outfile="opuntia1.aln")
>>> print(cline)
clustalw -infile=opuntia1.fasta -outfile=opuntia1.aln
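
If it helps to see what cline stands for, here is a minimal standard-library sketch that builds the same command as an argument list and, only if ClustalW and the input file are actually present, runs it. The helper names build_clustalw_command and run_clustalw are made up for this illustration, and it assumes the executable is called "clustalw" and is on your PATH. As far as I know, newer Biopython releases deprecate the command-line wrappers in favour of calling subprocess yourself, so something along these lines may be what you end up writing anyway.

```python
import os
import shutil
import subprocess


def build_clustalw_command(infile, outfile, exe="clustalw"):
    """Return the argument list for the command that cline prints (hypothetical helper)."""
    return [exe, "-infile=" + infile, "-outfile=" + outfile]


def run_clustalw(infile, outfile, exe="clustalw"):
    """Run ClustalW and return (stdout, stderr) as text (hypothetical helper)."""
    result = subprocess.run(
        build_clustalw_command(infile, outfile, exe),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ClustalW exits with an error
    )
    return result.stdout, result.stderr


command = build_clustalw_command("opuntia1.fasta", "opuntia1.aln")
print(" ".join(command))

# Only attempt the run when both the program and the input file are present.
if shutil.which(command[0]) and os.path.exists("opuntia1.fasta"):
    stdout, stderr = run_clustalw("opuntia1.fasta", "opuntia1.aln")
    print(stdout)
```

The first print shows the same command line as print(cline) above; nothing is executed until run_clustalw is called.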

Calling cline() in the next line actually runs the command constructed above and returns its output and error messages, which are assigned to the stdout and stderr variables respectively. If you print stdout, you will see ClustalW's progress report for the alignment; since the command ran without errors, printing stderr shows nothing. Meanwhile, the output file opuntia1.aln now contains the alignment. Open that .aln file and you should see it.

>>> stdout, stderr = cline()
>>>
>>> print(stdout)

 CLUSTAL 2.1 Multiple Sequence Alignments


Sequence format is Pearson
Sequence 1: CDS         1574 bp
Sequence 2: EST          723 bp
Start of Pairwise alignments
Aligning...

Sequences (1:2) Aligned. Score:  9
Guide tree file created:   [opuntia1.dnd]

There are 1 groups
Start of Multiple Alignment

Aligning...
Group 1:                     Delayed
Alignment Score 490

CLUSTAL-Alignment file created  [opuntia1.aln]


>>> print(stderr)
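
Once the .aln file exists, this is the file you hand to AlignIO.read with the "clustal" format (passing the raw FASTA file with that format is what triggers the "not a known Clustal header" error). Below is a small sketch of that step. To keep it runnable without ClustalW installed, it writes a tiny made-up Clustal file to a temporary directory first; the sequences and the file name example.aln are placeholders, and with your data you would simply point AlignIO.read at opuntia1.aln.

```python
import os
import tempfile

from Bio import AlignIO

# Made-up two-sequence alignment in Clustal layout, standing in for opuntia1.aln.
# Names are padded to a fixed width so the sequence columns line up.
clustal_lines = [
    "CLUSTAL 2.1 multiple sequence alignment",
    "",
    "",
    "seq1".ljust(16) + "ACGTACGTAC",
    "seq2".ljust(16) + "ACGTAC-TAC",
    "",
]

with tempfile.TemporaryDirectory() as tmpdir:
    aln_path = os.path.join(tmpdir, "example.aln")
    with open(aln_path, "w") as handle:
        handle.write("\n".join(clustal_lines))

    # The "clustal" format name tells AlignIO to expect the Clustal header.
    alignment = AlignIO.read(aln_path, "clustal")

print(len(alignment), alignment.get_alignment_length())
for record in alignment:
    print(record.id, record.seq)
```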

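For the .dnd file, you don't need to specify anything extra. By default ClustalW writes the guide tree to a .dnd file named after the input FASTA file (here opuntia1.dnd, as the "Guide tree file created" line in the output above shows). Here is a direct quote from the Biopython tutorial: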

By default ClustalW will generate an alignment and guide tree file with names based on the input FASTA file, in this case opuntia.aln and opuntia.dnd, but you can override this or make it explicit

Source: http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec89
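
The .dnd guide tree is an ordinary Newick file, so if you want to look at it from Python, Bio.Phylo can parse it. Here is a rough sketch; the two-leaf tree string is made up so that the example runs on its own, and with a real run you would pass the path "opuntia1.dnd" to Phylo.read instead of the StringIO handle.

```python
from io import StringIO

from Bio import Phylo

# Made-up two-leaf tree standing in for the contents of opuntia1.dnd.
NEWICK_TEXT = "(seqA:0.1,seqB:0.2);"

# With a real file: tree = Phylo.read("opuntia1.dnd", "newick")
tree = Phylo.read(StringIO(NEWICK_TEXT), "newick")

# Collect the leaf (sequence) names in the order they appear in the tree.
leaf_names = [leaf.name for leaf in tree.get_terminals()]
print(leaf_names)

# Quick text rendering of the tree in the terminal.
Phylo.draw_ascii(tree)
```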

Hope that helps, Cheers!