Pretty printing is failing with lxml when I add an element to an empty element


lxml's pretty printing fails when I add a child element to an empty XML element:

Here is my Python code:

import lxml.etree

xml_parser = lxml.etree.XMLParser(remove_blank_text=True)
xml_tree = lxml.etree.parse("myFile.xml", xml_parser)
root = xml_tree.getroot()

for elem in root:
    if elem.tag == "apple":
        lxml.etree.SubElement(elem, "McIntosh")

tree = lxml.etree.ElementTree(root)
tree.write("output.xml", pretty_print=True, xml_declaration=True, encoding="utf-8")

Here is my XML input file:

<fruits>
  <apple>
  </apple>
</fruits>

And here is my output XML file:

<?xml version='1.0' encoding='UTF-8'?>
<fruits>
  <apple>
  <McIntosh/></apple>
</fruits>

My code works when the XML element already contains a child, though. In that case the output is:

<?xml version='1.0' encoding='UTF-8'?>
<fruits>
  <apple>
    <golden/>
    <McIntosh/>
  </apple>
</fruits>

I will have to implement a custom XML formatter if I don't find a solution.

Does anyone know how to fix this problem?


I like Bananas On BEST ANSWER

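The line break and spaces inside the empty <apple> element are kept as its text, and pretty_print leaves elements that have text content alone. Use the indent function to fix it. As an added bonus, this lets you customize the indentation: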

import lxml.etree

xml_parser = lxml.etree.XMLParser(remove_blank_text=True)
xml_tree = lxml.etree.parse("myFile.xml", xml_parser)
root = xml_tree.getroot()

for elem in root:
    if elem.tag == "apple":
        lxml.etree.SubElement(elem, "McIntosh")

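# Re-indent the whole tree in place with four spaces per level
# (lxml.etree.indent needs lxml 4.5 or newer).
lxml.etree.indent(xml_tree, '    ')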

tree = lxml.etree.ElementTree(root)
tree.write("output.xml", pretty_print=True, xml_declaration=True, encoding="utf-8")

Here's the output:

<?xml version='1.0' encoding='UTF-8'?>
<fruits>
    <apple>
        <McIntosh/>
    </apple>
</fruits>
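
If you are stuck on an lxml version without indent, another option that should work is to drop the whitespace-only text of the element before adding the child, so that pretty_print can lay it out again. This is only a minimal sketch: it parses the question's input from an inline string instead of myFile.xml, and the is_blank helper is a name made up for illustration, not part of lxml.

```python
import lxml.etree

# Same input as in the question, inlined so the sketch is self-contained.
XML_INPUT = b"<fruits>\n  <apple>\n  </apple>\n</fruits>"


def is_blank(text):
    """Hypothetical helper: True if text is missing or whitespace only."""
    return text is None or not text.strip()


xml_parser = lxml.etree.XMLParser(remove_blank_text=True)
root = lxml.etree.fromstring(XML_INPUT, xml_parser)

for elem in root:
    if elem.tag == "apple":
        # The line break and spaces inside <apple> survive parsing as
        # elem.text; clear them so the serializer is free to indent.
        if is_blank(elem.text):
            elem.text = None
        lxml.etree.SubElement(elem, "McIntosh")

out = lxml.etree.tostring(root, pretty_print=True).decode("utf-8")
print(out)
```

Here the indentation is whatever pretty_print uses by default (two spaces), so you lose the ability to pick the indent width that indent gives you.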