Consider the following:
import com.jcabi.xml.XML;
import com.jcabi.xml.XMLDocument;

// The HTML inside <tag> is escaped, so the document has a single text node.
String s = "<tag>This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.</tag>";
final XML xml = new XMLDocument(s);
String extractedText = xml.xpath("//tag/text()").get(0);
System.out.println(extractedText); // Output: This has a <a href="#">link</a>.
System.out.println(s.contains(extractedText)); // Output: false!
System.out.println(s.contains("This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.")); // Output: true
I have an XML file given as a string with some escaped HTML. Using the jcabi library, I get the text of the relevant elements (in this case everything in <tag>s). However, what I get isn't actually what's in the original string: I'm expecting &lt; and &gt; but am getting < and > instead. The original string paradoxically does not contain the substring that I extracted from it.
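To check whether this is specific to jcabi, here is a minimal sketch of the same extraction using only the JDK's built-in javax.xml.xpath API. The class name EscapedTextRepro and the helper extract are made up for this example, and I'm assuming the default JDK parser. If it shows the same mismatch, the unescaping is happening at the XML parsing level rather than in jcabi itself.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class EscapedTextRepro {
    // Hypothetical helper: evaluates the same XPath as above, without jcabi.
    static String extract(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate(String, InputSource) returns the string value of the first match.
        return xpath.evaluate("//tag/text()", new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String s = "<tag>This has a &lt;a href=\"#\"&gt;link&lt;/a&gt;.</tag>";
        String extracted = extract(s);
        System.out.println(extracted);             // the parsed text node
        System.out.println(s.contains(extracted)); // is it a substring of the source?
    }
}
```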
How can I get the actual text and not an unescaped version?