I'm using xml.etree.ElementTree to parse an XML file.
How can I force it to either strip whitespace from the text (only regular spaces, not &#32;) or keep the spaces and leave the escapes unexpanded (as they are)?
Here is my problem:
import xml.etree.ElementTree

xml_text = """
<root>
 <mytag>
 data_with_space&#32;
 </mytag>
</root>"""
root = xml.etree.ElementTree.fromstring(xml_text)
mytag = root.find("mytag")
# The parser has already expanded &#32; to a plain space at this point.
print("original text:", repr(mytag.text))
print("stripped text:", repr(mytag.text.strip()))
It prints:
original text: '\n data_with_space \n '
stripped text: 'data_with_space'
What I need:
'data_with_space '
or (which I can unescape by other means):
'data_with_space&#32;'
A solution using xml.etree.ElementTree is preferable, because otherwise I'd have to rewrite a whole lot of code.
The standard XML library treats &#32; and ' ' as equal. There is no way to avoid this if you apply fromstring(xml_text) directly, so the two can no longer be told apart afterwards. The only way to stop the reference from being expanded is to translate it into something else before calling fromstring(), and to translate it back afterwards. You would get:
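A minimal sketch of this translate-before, translate-back idea follows. It assumes the escape in question is spelled exactly &#32;, and that the placeholder string (PLACEHOLDER, a hypothetical name chosen here for illustration) never occurs anywhere else in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; assumed never to occur in the real document.
PLACEHOLDER = "__SPACE_REF__"

xml_text = """
<root>
 <mytag>
 data_with_space&#32;
 </mytag>
</root>"""

# Parsed directly, the reference is already expanded to a plain space.
plain = ET.fromstring(xml_text).find("mytag").text

# 1. Hide the reference from the parser before parsing.
protected = xml_text.replace("&#32;", PLACEHOLDER)
mytag = ET.fromstring(protected).find("mytag")

# 2. Strip ordinary whitespace; the placeholder shields the escaped space.
stripped = mytag.text.strip()

# 3. Translate the placeholder back, to a real space or to the reference.
as_space = stripped.replace(PLACEHOLDER, " ")
as_reference = stripped.replace(PLACEHOLDER, "&#32;")

print(repr(plain))
print(repr(as_space))
print(repr(as_reference))
```

Note that this is a blunt text substitution: it also rewrites &#32; inside attributes, comments and CDATA sections, and it does not catch other spellings of the same character such as &#x20;.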