This is my file (UTF-8 encoded):
<?xml version="1.0" encoding="UTF-8"?>
<foo>
<bar>Hello World äüö</bar>
</foo>
I would like to use xmllint to produce this result:
<bar>Hello World äüö</bar>
But every command prints the Unicode characters encoded as character references:
$ xmllint --xpath "//bar" file.xml
<bar>Hello World &#xE4;&#xFC;&#xF6;</bar>
$ xmllint --xpath "//bar" --encode utf-8 file.xml
<bar>Hello World &#xE4;&#xFC;&#xF6;</bar>
$ xmllint --xpath "//bar" --noenc file.xml
<bar>Hello World &#xE4;&#xFC;&#xF6;</bar>
Do you have any idea how to get the unencoded result? (I cannot install other tools such as xmlstarlet.)
$ xmllint --version
xmllint: using libxml version 20907
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma
$ locale
LANG=C.utf8
LC_CTYPE="C.utf8"
LC_NUMERIC="C.utf8"
LC_TIME="C.utf8"
LC_COLLATE="C.utf8"
LC_MONETARY="C.utf8"
LC_MESSAGES="C.utf8"
LC_PAPER="C.utf8"
LC_NAME="C.utf8"
LC_ADDRESS="C.utf8"
LC_TELEPHONE="C.utf8"
LC_MEASUREMENT="C.utf8"
LC_IDENTIFICATION="C.utf8"
LC_ALL=
$ cat /etc/*-release
Rocky Linux release 8.8 (Green Obsidian)
Best option seems to be the cat command of xmllint's internal shell.
Given the XML file, send cat <xpath expression> to the internal shell.
The issue looks related to the xmllint version (ultimately the libxml2 version). See details below.
Using xmllint --shell
--noenc and no XPath: --noenc takes precedence over --noent, which makes sense; all characters in the output are ASCII.
--noent (looks like the default)
--xpath: --noenc is ignored
--shell: --noenc is ignored by the internal cat command and enforced by the xpath one.
ASCII encoding
The lxml Python module is also based on libxml2, so here's a one-liner that does the same.
Text result
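A minimal sketch of the text-result case, assuming Python is available on the machine. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption, chosen so that nothing has to be installed; it is backed by expat rather than libxml2). The names XML and bar_text are illustrative only, and the document is the one from the question.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, held as UTF-8 bytes.
# (Stand-in for file.xml; hypothetical constant, not part of any API.)
XML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<foo>\n"
    "<bar>Hello World äüö</bar>\n"
    "</foo>\n"
).encode("utf-8")


def bar_text(data: bytes) -> str:
    """Return the text content of the first <bar> element."""
    root = ET.fromstring(data)
    bar = root.find(".//bar")
    return bar.text


if __name__ == "__main__":
    # The text is a Python str, so no character references are involved.
    print(bar_text(XML))
```

Because only the text node is read here, the question of how the element is serialized does not come up; the result is already decoded text.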
Serializing without indicating encoding
Serializing with encoding
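The difference between the two serialization cases can be sketched as follows. This again assumes the standard library's xml.etree.ElementTree in place of lxml (lxml's etree.tostring takes a similar encoding argument, but the exact output may differ). DOC and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Compact version of the question's document, so <bar> carries no trailing whitespace.
DOC = "<foo><bar>Hello World äüö</bar></foo>"
bar = ET.fromstring(DOC).find("bar")

# No encoding given: ElementTree falls back to US-ASCII, so non-ASCII
# characters come out as numeric character references.
ascii_bytes = ET.tostring(bar)
# → b'<bar>Hello World &#228;&#252;&#246;</bar>'

# encoding="unicode": the result is a str with the characters left as-is.
as_text = ET.tostring(bar, encoding="unicode")
# → '<bar>Hello World äüö</bar>'

# encoding="utf-8": the result is bytes holding the raw UTF-8 sequences.
utf8_bytes = ET.tostring(bar, encoding="utf-8")

if __name__ == "__main__":
    print(ascii_bytes)
    print(as_text)
    print(utf8_bytes.decode("utf-8"))
```

This mirrors what xmllint shows: when the serializer targets ASCII it has to escape äüö, and once an encoding that can represent them is requested, the characters are written unescaped.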