I am using Python to make some conditional changes to an XML document. The incoming document has <?xml version="1.0" ?> at the top.
I'm using xml.etree.ElementTree.
How I'm serializing the changed XML:
filter_update_body = ET.tostring(root, encoding="utf8", method="xml")
The output has this at the top:
<?xml version='1.0' encoding='utf8'?>
The client wants the "encoding" attribute removed, but if I remove it, the output either doesn't include the line at all or contains encoding='us-ascii'.
Can this be done so the output matches: <?xml version="1.0" ?>?
(I don't know why it matters honestly but that's what I was told needed to happen)
As pointed out in this answer, there is no way to make ElementTree omit the encoding attribute. However, as @James suggested in a comment, it can be stripped from the resulting output afterwards by calling replace() on the serialized result with b-prefixed (bytes) arguments.
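A minimal sketch of that replacement, assuming the declaration produced by `ET.tostring()` is exactly the one shown in the question. The sample document and the names `old_declaration` and `new_declaration` are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for the real input.
root = ET.fromstring("<filter><name>example</name></filter>")

# With encoding="utf8", tostring() returns bytes that begin with this declaration.
old_declaration = b"<?xml version='1.0' encoding='utf8'?>"
new_declaration = b'<?xml version="1.0" ?>'

xml_bytes = ET.tostring(root, encoding="utf8", method="xml")

# Replace at most one occurrence so identical text in the body is left alone.
filter_update_body = xml_bytes.replace(old_declaration, new_declaration, 1)

print(filter_update_body.decode("utf-8"))
```

Limiting the replacement to a count of 1 is a defensive choice: the declaration can only appear once, at the very start of the output.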
The `b` prefixes are required because `ET.tostring()` will return a `bytes` object if `encoding != "unicode"`. In turn, we need to call `bytes.replace()`.
With `encoding = "unicode"` (note that this is the literal string "unicode"), it will return a regular `str`. In this case, the `b`s can be omitted. We use good old `str.replace()`.
It's worth noting that the choice between `bytes` and `str` also affects how the XML will eventually be written to a file. A `bytes` object should be written in binary mode, a `str` in text mode.
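To illustrate the two write modes, here is a rough sketch that writes both variants to temporary files. The file names and sample document are hypothetical. One assumption to flag: with `encoding="unicode"` and default arguments, `ET.tostring()` emits no declaration at all, so this sketch prepends the desired line to the `str` rather than replacing anything.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for the real input.
root = ET.fromstring("<filter><name>example</name></filter>")

# bytes variant: swap the declaration, then write in binary mode ("wb").
xml_bytes = ET.tostring(root, encoding="utf8", method="xml").replace(
    b"<?xml version='1.0' encoding='utf8'?>", b'<?xml version="1.0" ?>', 1
)

# str variant: no declaration is emitted by default, so prepend the wanted one.
xml_text = '<?xml version="1.0" ?>\n' + ET.tostring(
    root, encoding="unicode", method="xml"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    bytes_path = os.path.join(tmp_dir, "from_bytes.xml")
    text_path = os.path.join(tmp_dir, "from_text.xml")

    # A bytes object goes to a file opened in binary mode.
    with open(bytes_path, "wb") as f:
        f.write(xml_bytes)

    # A str goes to a file opened in text mode; newline="\n" avoids
    # platform-specific newline translation.
    with open(text_path, "w", encoding="utf-8", newline="\n") as f:
        f.write(xml_text)

    # Read both files back as raw bytes for comparison.
    with open(bytes_path, "rb") as f:
        bytes_file_content = f.read()
    with open(text_path, "rb") as f:
        text_file_content = f.read()
```

Under these assumptions both files end up with the same content, so the choice between the two variants mostly comes down to what the rest of the code expects to handle.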