Print actual characters instead of NCRs


I'd like to print an XML document without reducing all of the Unicode characters it contains to ugly NCRs (numeric character references). Here's a sample:

use XML::LibXML;
my $parser = XML::LibXML->new();
my $doc = $parser->load_xml(string => '<xml>ＦＵＬＬ ＷＩＤＴＨ</xml>');
print $doc->toString();

This prints the following:

<?xml version="1.0"?>
<xml>&#xFF26;&#xFF35;&#xFF2C;&#xFF2C; &#xFF37;&#xFF29;&#xFF24;&#xFF34;&#xFF28;</xml>

Very, very ugly and difficult to read (unless viewed in a browser or something).

How can I get the document to print real characters, and to have a utf-8 (or whatever other encoding) declaration?

Best answer

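load_xml returns an XML::LibXML::Document object. Because the input carries no encoding declaration, the document has no encoding set, and toString() escapes every non-ASCII character as an NCR. The Document class has a setEncoding method that fixes this: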

$doc->setEncoding('utf-8');

Now the script prints this:

<?xml version="1.0" encoding="utf-8"?>
<xml>ＦＵＬＬ ＷＩＤＴＨ</xml>
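
For reference, here is a minimal sketch of the whole script with the fix applied. Two things in it are my own choices rather than part of the question: the input is written with NCRs so the script's source stays pure ASCII (the parsed document still holds the same fullwidth characters), and STDOUT is set to raw mode on the assumption that toString() returns bytes already encoded in the document's encoding once one is set.

```perl
use strict;
use warnings;
use XML::LibXML;

# Input uses NCRs only to keep this source file ASCII; once parsed,
# the text node contains the actual fullwidth characters.
my $doc = XML::LibXML->load_xml(
    string => '<xml>&#xFF26;&#xFF35;&#xFF2C;&#xFF2C; '
            . '&#xFF37;&#xFF29;&#xFF24;&#xFF34;&#xFF28;</xml>',
);

# Give the document an encoding so the serializer can emit the
# characters directly and add an encoding declaration.
$doc->setEncoding('UTF-8');

# The serialized string is already encoded bytes, so print it through a
# raw handle; an :encoding(UTF-8) layer here would encode it a second time.
binmode(STDOUT, ':raw');
print $doc->toString();
```

If the output goes to a file rather than the terminal, the same caveat applies: open the handle without an encoding layer, or let the document write itself with $doc->toFile($filename).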