I have parsed an XML file using the xmltodict module, and the result is stored in a dictionary of dictionaries.
Now I want to remove the special characters @ and # from every key of the dictionary.
def remove_using_json(parse_result):
    data = {}
    data = json.dumps(parse_result)
    #print data
    #for d in data:
    for key, value in data.iterkeys():
        if key[0] == '@':
            data[key] = key.strip("@")
        elif key[0] == '#':
            data[key] = key.strip("#")
There is no direct way to eliminate those during parsing as they are used to denote attributes and text nodes allowing them to be distinguished from elements (if they weren't there the output would be unusable).
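For example, an element with an id attribute and some text content produces a nested ordered dict in which the attribute is stored under the key @id and the text under the key #text.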
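As a concrete sketch, assume a hypothetical input with a single node element that carries an id attribute and some text. The element names and values here are made up for illustration, and the dict is written out by hand (mirroring what xmltodict.parse returns for this input) so the snippet runs without the library installed:

```python
# Hypothetical XML input, chosen only for illustration.
xml_source = '<root><node id="1">Some text</node></root>'

# With xmltodict installed, the parsed result would come from:
#     import xmltodict
#     doc = xmltodict.parse(xml_source)
# The equivalent structure, written by hand:
doc = {
    "root": {
        "node": {
            "@id": "1",            # attribute: key prefixed with @
            "#text": "Some text",  # text node: stored under #text
        }
    }
}

print(sorted(doc["root"]["node"].keys()))
```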
The @ symbol tells me that the @id is an attribute. Without that symbol, I couldn't tell if it was an attribute or an element named id. Similarly, the # symbol tells me that #text is the text value of that element. Without that I couldn't tell if it was the element's text, or if it was an element named text.
However, when dealing with the keys, you can strip them using ky[1:], where ky is the key. For example, if I assign the above parsed output to the variable doc, I can loop over the keys of a node and print ky[1:] for each key that starts with @ [1], which outputs the attribute name with the @ symbol stripped.
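A minimal sketch of that loop, assuming doc holds a hypothetical parsed structure with a root element and one node (these names are illustrative, not taken from any particular file):

```python
# Hypothetical parsed output, shaped like what xmltodict produces.
doc = {"root": {"node": {"@id": "1", "#text": "Some text"}}}

# Collect a label for every attribute key, dropping the leading @ with ky[1:].
labels = []
for ky in doc["root"]["node"]:
    if ky.startswith("@"):
        labels.append("Attribute: " + ky[1:])

for label in labels:
    print(label)
```

Only the printed label changes here; the keys inside doc still carry their prefixes.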
If you really want to remove these symbols completely from the parsed value, you can write a recursive function to do this.
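A minimal sketch of such a function, assuming the parsed output contains only dicts, lists, and plain values, might look like the following. The sample data is hypothetical, and the copy step is optional:

```python
import copy


def remover(node):
    """Strip a leading @ or # from every key, recursively and in place."""
    if isinstance(node, dict):
        # Iterate over a snapshot of the keys, since the dict is modified.
        for ky in list(node.keys()):
            value = node.pop(ky)
            remover(value)
            new_ky = ky[1:] if ky[:1] in ("@", "#") else ky
            # If new_ky already exists, the earlier value is overwritten.
            node[new_ky] = value
    elif isinstance(node, list):
        # Repeated elements are parsed into lists of dicts.
        for item in node:
            remover(item)


# Hypothetical parsed output with repeated and nested elements.
doc = {
    "root": {
        "node": [
            {"@id": "1", "#text": "first"},
            {"@id": "2", "child": {"@lang": "en", "#text": "second"}},
        ]
    }
}

cleaned = copy.deepcopy(doc)  # keep the original untouched
remover(cleaned)
print(cleaned)
```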
With such a function, here called remover, calling remover(doc) would remove all of the @ and # symbols from the keys. The behavior may be unstable and will lose some data if any node has an element and an attribute with the same name, or has an element or attribute named text, which is precisely why those symbols are there in the first place. The function modifies the object in place, so if the original needs to be preserved, make a deepcopy and pass that to the function.

[1] This uses Python 3 syntax, where print is a function. To make this example work in Python 2.6 or 2.7, first issue from __future__ import print_function, or change the print function calls to statements like print "Attribute: "+ky[1:].