Biopython comes with methods to interface with the PAML package for phylogenetic analysis.
In particular I am using Bio.Phylo.PAML to run analyses using PAML's codeml.exe program which in my case does Ka/Ks (dN/dS) ratio analysis on pairs of orthologous gene sequences.
After running the analysis using results = cml.run()
I can see it has successfully generated result.out
files that look about right. Most importantly the final line of the file is what I need to parse into Python:
t= 0.2173 S= 703.9 N= 1489.1 dN/dS= 0.2247 dN = 0.0344 dS = 0.1529
What I am most interested in is dN/dS = 0.2247
According to Biopython's PAML wiki this value can be obtained from Python by doing results = cml.run()
which generates a dictionary with a set of values I am interested in after running the analysis. The wiki claims I can find the values I need in a key called 'parameters'
. But this only returns one of the values I need t= 0.2173
, look:
>>> results.values()
['Fcodon', 'One dN/dS ratio for branches, ', '4.7b', {0: {'description': 'one-ratio', 'parameters': {'t': 0.1982}}}, {'htlv': {}, 'stlv': {}}]
Notice, that my parameters key only contains the t= 0.2173
and has omitted S= 703.9 N= 1489.1 dN/dS= 0.2247 dN = 0.0344 dS = 0.1529
Could anybody with codeml
experience explain to me why the parser fails to yield most of the parameters (values) I am interested in?
Extra details
- Using Python2.7, PAML4.7a
- Running on Windows 7
- I will readily edit in any data or info you require to help me fix this
Cross-posting from https://www.biostars.org/p/89848/:
You can obtain omega values for each branch using the following commands:
This gives you a list of omega (dn/ds) for each branch
using version 4.7b and biopython-1.6.3