xmltodict: ExpatError: syntax error: line 1, column 0, getting xml from QuickGO

I'm confused by the error I get from the following code:

from future.standard_library import install_aliases
install_aliases()
from urllib.request import urlopen
import xmltodict

oboxml = urlopen('http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0003723&format=oboxml')
obo_dict = oboxml.read()
obo_dict_parse = xmltodict.parse(obo_dict)

which returns the error:

ExpatError                                Traceback (most recent call last)
in
      6 oboxml = urlopen('http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0003723&format=oboxml')
      7 obo_dict = oboxml.read()
----> 8 obo_dict_parse = xmltodict.parse(obo_dict)

~/anaconda3/lib/python3.6/site-packages/xmltodict.py in parse(xml_input, encoding, expat, process_namespaces, namespace_separator, disable_entities, **kwargs)
    328         parser.ParseFile(xml_input)
    329     else:
--> 330         parser.Parse(xml_input, True)
    331     return handler.item
    332

ExpatError: syntax error: line 1, column 0

A response is being returned, so it's not empty. But I'm not sure why the XML parser would be unhappy, particularly as I'm following a tutorial notebook*. I'm using xmltodict 0.11.0-py36_1, installed and updated through Anaconda.

Any pointers would be much appreciated.

*Following notebook: goatools https://link.springer.com/protocol/10.1007/978-1-4939-3743-1_16

There is 1 solution below

From what I can gather:

QuickGO doesn't appear to provide an XML format any more. XML is not listed as a response type in their API: https://www.ebi.ac.uk/QuickGO/api/index.html#!/annotations/downloadLookupUsingGET
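One way to check what the server actually sent, before handing it to xmltodict, is to run the raw bytes through the same expat parser and print the start of the payload when parsing fails. This is only a minimal sketch: describe_payload is a hypothetical helper name, and the two sample payloads are made up for illustration, not real QuickGO responses.

```python
from xml.parsers import expat


def describe_payload(payload, preview_len=80):
    """Return 'xml' if expat can parse the payload, else a short diagnostic.

    Hypothetical helper for illustration; not part of xmltodict or QuickGO.
    """
    parser = expat.ParserCreate()
    try:
        # Same call that xmltodict.parse makes internally in the traceback.
        parser.Parse(payload, True)
    except expat.ExpatError as err:
        # Show the first bytes so it is obvious what the server returned.
        preview = payload[:preview_len].decode("utf-8", errors="replace")
        return "not xml ({}); starts with: {!r}".format(err, preview)
    return "xml"


# Made-up sample payloads: one well-formed XML document, one JSON body.
print(describe_payload(b"<obo><term><id>GO:0003723</id></term></obo>"))
print(describe_payload(b'{"messages": ["not found"]}'))
```

An error at line 1, column 0 usually means the very first character is not valid XML, so the preview tends to reveal quickly whether the body is JSON, plain text, or an error message.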

I tried an alternative fetch method using the bioservices package and its QuickGO tools. The package seems to have diverged from its documentation, which shows this call:

term = qgo.Term("GO:0000016", frmt="oboxml")

That call no longer works: Term has been replaced with Terms, which no longer accepts 'frmt=' or 'format='.

The closest I can currently get is:

from bioservices import QuickGO

# Instantiate under a separate name so the QuickGO class is not shadowed.
qgo = QuickGO()
term = qgo.Terms("GO:0000016")
print(term, type(term))

which will return a list.
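If JSON is acceptable, the term can also be requested without bioservices. The sketch below assumes that the QuickGO REST service exposes GO terms under /services/ontology/go/terms/ and honours an "Accept: application/json" header; please check both against the API page linked above. build_term_request and fetch_terms are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed base URL of the QuickGO ontology service; verify on the API page.
QUICKGO_TERMS_URL = "https://www.ebi.ac.uk/QuickGO/services/ontology/go/terms/"


def build_term_request(go_ids):
    """Build a JSON request for one or more GO ids (hypothetical helper)."""
    # Percent-encode the ids, including ':' and ',', so they are safe in a path.
    joined = quote(",".join(go_ids), safe="")
    return Request(QUICKGO_TERMS_URL + joined,
                   headers={"Accept": "application/json"})


def fetch_terms(go_ids):
    """Send the request and decode the JSON body (needs network access)."""
    with urlopen(build_term_request(go_ids)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network.
request = build_term_request(["GO:0003723"])
print(request.full_url)
```

Calling fetch_terms(["GO:0003723"]) would then return the decoded JSON, but that is still not the OBO XML that the tutorial expects.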

An XML file was desired for use with the goatools package. I will edit the original question to clarify this in case others come across the same problem.