I am trying to load XML from URLs like this one (http://musicbrainz.org/ws/2/artist/72c536dc-7137-4477-a521-567eeb840fa8) into Python and extract the value of the "gender" element.
import urllib2
import codecs
import sys
import os
from xml.dom import minidom
import xml.etree.cElementTree as ET
#urlbob = urllib2.urlopen('http://musicbrainz.org/ws/2/artist/72c536dc-7137-4477-a521-567eeb840fa8')
url = 'dylan.xml'
#attempt 1 - using minidom
xmldoc = minidom.parse(url)
itemlist = xmldoc.getElementsByTagName('artist')
#attempt 2 - using ET
tree = ET.parse('dylan.xml')
root = tree.getroot()
for child in root:
    print child.tag, child.attrib
I can't seem to get at gender via either the minidom approach or the ElementTree approach. In its current form, the script prints:
{http://musicbrainz.org/ns/mmd-2.0#}artist {'type': 'Person', 'id': '72c536dc-7137-4477-a521-567eeb840fa8'}
That is because you're looping over root, which is only the root of the tree. Does that make sense? Iterating over root yields only its direct children (here, the single artist element) and stops there; it does not descend any further. To reach gender, you need to iterate over the nested elements as well, so that each descendant node is visited in turn.
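As a rough sketch of that idea (not necessarily the only way to do it), the snippet below searches all descendants of the root for the gender element. The namespace URI is copied from the tag printed in the question's output. The SAMPLE string is a made-up stand-in for dylan.xml, since the real file's contents aren't shown here, and find_gender is just an illustrative helper name. It uses print() with a single argument so it should run on either Python 2.7 or Python 3.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the tag shown in the question's output.
NS = {'mb': 'http://musicbrainz.org/ns/mmd-2.0#'}

# Hypothetical stand-in for dylan.xml; the structure and values are assumed.
SAMPLE = (
    '<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#">'
    '<artist type="Person" id="72c536dc-7137-4477-a521-567eeb840fa8">'
    '<name>Example Artist</name>'
    '<gender>placeholder</gender>'
    '</artist>'
    '</metadata>'
)


def find_gender(root):
    """Return the text of the first namespaced gender element, or None."""
    # './/' searches every descendant, not just the direct children of root.
    node = root.find('.//mb:gender', NS)
    return node.text if node is not None else None


root = ET.fromstring(SAMPLE)  # with a real file: ET.parse('dylan.xml').getroot()
print(find_gender(root))
```

If you'd rather walk every node, looping over root.iter() and checking each element's tag against '{http://musicbrainz.org/ns/mmd-2.0#}gender' should work as well; either way the namespace prefix has to be part of the tag you match.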
Results: