Python xml: encode numeric character references in hex form


I have a number of scripts which get external data and update parts of xml files.

I use lxml in my Python scripts, and it saves character references in decimal notation. For example:

$ cat input.xml
<data>
  <record text="&#x41f;&#x440;&#x438;&#x432;&#x435;&#x442;">
  </record>
</data> 

$ python
>>> from lxml import etree
>>> tree = etree.parse("input.xml")
>>> tree.write("out.xml")

$ cat out.xml
<data>
  <record text="&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;">
  </record>
</data>

Other scripts write the hex form, <record text="&#x41f;&#x440;&#x438;&#x432;&#x435;&#x442;">, so git shows an endless series of changes to these files even when the content has not actually changed.

How can I tell lxml to save character references in hex form (&#x41f;) from a Python script?
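
For illustration, here is a minimal sketch of one possible workaround, not a confirmed lxml option: serialize the tree as UTF-8, then re-encode to ASCII with a custom codec error handler that emits hex references. The handler name "xmlcharrefreplace_hex" and the helpers to_hex_refs and write_with_hex_refs are hypothetical names made up for this example. It assumes every non-ASCII character should become a hex reference, including any inside comments or CDATA sections, where references are not interpreted.

```python
import codecs
import io


def _hex_charref(err):
    """Codec error handler: replace unencodable characters with &#x...; references."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    chunk = err.object[err.start:err.end]
    # "%x" yields lowercase hex, matching the &#x41f; style used by the other scripts.
    return "".join("&#x%x;" % ord(ch) for ch in chunk), err.end


# Hypothetical handler name; it only has to match the errors= argument below.
codecs.register_error("xmlcharrefreplace_hex", _hex_charref)


def to_hex_refs(text):
    """Encode text to ASCII bytes, writing non-ASCII characters as hex references."""
    return text.encode("ascii", errors="xmlcharrefreplace_hex")


def write_with_hex_refs(tree, path):
    """Serialize an lxml ElementTree to path using hex character references."""
    buf = io.BytesIO()
    # UTF-8 output keeps the characters literal instead of decimal references.
    tree.write(buf, encoding="utf-8")
    with open(path, "wb") as out:
        out.write(to_hex_refs(buf.getvalue().decode("utf-8")))
```

With the tree from the example above, write_with_hex_refs(tree, "out.xml") would replace tree.write("out.xml"). The sketch normalizes every file to hex form; it does not remember which notation the input used.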
