Java jcabi xpath returns unescaped text


Consider the following:

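import com.jcabi.xml.XML;
import com.jcabi.xml.XMLDocument;

// XML whose text content is HTML with the angle brackets escaped as entities
String s = "<tag>This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.</tag>";
final XML xml = new XMLDocument(s);
// xpath() returns the matched text nodes as a List<String>; take the first one
String extractedText = xml.xpath("//tag/text()").get(0);
System.out.println(extractedText); // Output: This has a <a href="#">link</a>.
System.out.println(s.contains(extractedText)); // Output: false!
System.out.println(s.contains("This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.")); // Output: true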

I have an XML document, held as a string, that contains some escaped HTML. Using the jcabi-xml library, I extract the text of the relevant elements (in this case, everything inside <tag> elements). However, what I get back is not what is in the original string: I expect &lt; and &gt; but get < and > instead. Paradoxically, the original string does not contain the substring that I extracted from it.

How can I get the actual text and not an unescaped version?
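
For reference, the same behavior can be reproduced without jcabi, which suggests it comes from the XML parser decoding entity references before XPath ever sees the text. The sketch below uses only the JDK's javax.xml.xpath API and then re-escapes the result. The class name EscapedTextDemo and the helpers evaluate and escapeText are made up for illustration and are not part of jcabi. The re-escaping step is an assumption, not a faithful recovery of the source: it only matches the original when the source escaped exactly &, < and > as &amp;, &lt; and &gt;, and it will not match text written with numeric character references or CDATA sections.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are illustrative, not a jcabi API.
final class EscapedTextDemo {

    // Re-escapes the characters that were decoded by the parser.
    // Assumption: the source used only &amp;, &lt; and &gt; in text content.
    static String escapeText(String text) {
        // '&' goes first so the entities produced below are not escaped twice.
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Evaluates an XPath expression against an XML string using the JDK's
    // built-in XPath engine and returns the string value of the result.
    static String evaluate(String xmlSource, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xmlSource)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String s = "<tag>This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.</tag>";

        // The parser has already turned &lt; and &gt; into < and >.
        String decoded = evaluate(s, "//tag/text()");
        System.out.println(decoded);

        // Re-escape to get back a form that can be found in the source string.
        String reEscaped = escapeText(decoded);
        System.out.println(s.contains(decoded));   // false
        System.out.println(s.contains(reEscaped)); // true
    }
}
```

If the exact source bytes matter (for example, to compute offsets into the original string), re-escaping is probably not enough, and the positions would have to come from a parser that reports character offsets.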
