I'd like to use XMLStreamReader to read an XML file that contains horizontal tab character references (&#9;), for example:
<tag>foo&#9;bar</tag>
and print out or write it back to another xml file.
Google tells me to set javax.xml.stream.isCoalescing to true in XMLInputFactory, but my test code below does not work as expected.
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public static void main(String[] args) throws IOException, XMLStreamException {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    factory.setProperty(XMLInputFactory.IS_COALESCING, true);
    System.out.println("IS_COALESCING supported ? " + factory.isPropertySupported(XMLInputFactory.IS_COALESCING));
    System.out.println("factory IS_COALESCING value is " + factory.getProperty(XMLInputFactory.IS_COALESCING));
    // The tab is written as a numeric character reference in the input.
    String rawString = "<tag>foo&#9;bar</tag>";
    XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(rawString));
    System.out.println("reader IS_COALESCING value is " + reader.getProperty(XMLInputFactory.IS_COALESCING));
    PrintWriter pw = new PrintWriter(System.out, true);
    while (reader.hasNext()) {
        reader.next();
        // Print the numeric event type, followed by the text if the event carries any.
        pw.print(reader.getEventType());
        if (reader.hasText()) {
            pw.append(' ').append(reader.getText());
        }
        pw.println();
    }
}
The output is
IS_COALESCING supported ? true
factory IS_COALESCING value is true
reader IS_COALESCING value is true
1
4 foo bar
2
8
But I want to keep the horizontal tab exactly as it is written in the input, like this:
IS_COALESCING supported ? true
factory IS_COALESCING value is true
reader IS_COALESCING value is true
1
4 foo&#9;bar
2
8
What am I missing here? Thanks.
From what I see, the parsing part is correct; it's just not printed the way you envision it. The character reference is interpreted by the XML reader as \t and represented accordingly in Java.
Using Guava's XmlEscapers, I can produce something similar to what you want to have:
The output looks like this:
Some remarks on this:
\t does not need to be escaped in XML content, so I had to choose the attribute escaper. While it works, there might be some side effects.
CDATA?
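The re-escaping idea can also be sketched with the JDK alone. This is only a minimal sketch under stated assumptions, not the Guava-based code referred to above: the class `TabEscapeSketch` and the helpers `escapeText` and `readAndEscape` are hypothetical names, and emitting the tab as `&#9;` (rather than some other equivalent form) is an assumption made to match the input in the question. It relies on the fact that the reader hands back a real tab character, which then has to be turned into a character reference again on output.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TabEscapeSketch {

    // Hypothetical helper: escapes XML markup characters and rewrites a
    // literal tab as the numeric character reference &#9;.
    public static String escapeText(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;"); break;
                case '<':  sb.append("&lt;");  break;
                case '>':  sb.append("&gt;");  break;
                case '\t': sb.append("&#9;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses the document and returns all character
    // data, concatenated and re-escaped with escapeText.
    public static String readAndEscape(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        while (reader.hasNext()) {
            // getText() returns the decoded text, so the tab is a real '\t' here.
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                out.append(escapeText(reader.getText()));
            }
        }
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readAndEscape("<tag>foo&#9;bar</tag>"));
    }
}
```

Note that handing the already escaped string to `XMLStreamWriter.writeCharacters` would escape the `&` a second time, so with this approach the text has to be written to the output directly.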