Escape HTML in Java using entity numbers instead of names


Java can escape HTML using e.g. Apache Commons Text StringEscapeUtils, but it will use HTML entity names, not numbers. So > becomes &gt;. Is there a library, class, or some feasible custom approach that will use HTML entity numbers, so > becomes &#62; and € becomes &#8364;, etc.?

(Context: I'm using a platform where HTML escaped with entity names is no longer accepted in a certain process for security reasons, so I really need the entity numbers.)


There are 2 best solutions below

0
Reilas On

There doesn't appear to be an option for this in StringEscapeUtils.

The Apache Commons Text library maps the characters directly to the entity names.
Apache Commons Text - EntityArrays.java

You can create your own implementation using the basic set: ", &, <, and >.

String escape(String string) {
    StringBuilder escaped = new StringBuilder();
    for (char character : string.toCharArray()) {
        switch (character) {
            case '"', '&', '<', '>' -> {
                escaped.append("&#%d;".formatted((int) character));
            }
            default -> escaped.append(character);
        }
    }
    return escaped.toString();
}

Input

<html>stack overflow</html>

Output

&#60;html&#62;stack overflow&#60;/html&#62;

If you want an escape routine that covers a much broader range of characters, you can add a range parameter.
Note that the ranges here are the allowed code points, not the code points to be escaped.

record Range(int start, int limit) { }

/** @param ranges allowed code points (inclusive); everything else is escaped */
String escape(String string, Range... ranges) {
    StringBuilder escaped = new StringBuilder();
    // Iterate by code point, not by char, so surrogate pairs stay intact.
    string.codePoints().forEach(codePoint -> {
        boolean allowed = false;
        for (Range range : ranges) {
            if (codePoint >= range.start() && codePoint <= range.limit()) {
                allowed = true;
                break;
            }
        }
        if (allowed) escaped.appendCodePoint(codePoint);
        else escaped.append("&#").append(codePoint).append(';');
    });
    return escaped.toString();
}
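
As a usage sketch for the range approach, here is a self-contained version that only needs the JDK (Java 16+ for records). The class name NumericEscapeDemo and the SAFE_ASCII allow-list are made up for illustration and are not part of any library. The allow-list is printable ASCII with ", &, < and > left out, so those four characters and everything outside ASCII come out as entity numbers.

```java
import java.util.List;

public class NumericEscapeDemo {
    record Range(int start, int limit) { }

    // Hypothetical allow-list: printable ASCII, split around ", &, < and >.
    static final List<Range> SAFE_ASCII = List.of(
        new Range(0x20, 0x21),   // space and !
        new Range(0x23, 0x25),   // # $ %
        new Range(0x27, 0x3B),   // ' through ;
        new Range(0x3D, 0x3D),   // =
        new Range(0x3F, 0x7E));  // ? through ~

    static String escape(String input, List<Range> allowed) {
        StringBuilder out = new StringBuilder();
        // Walk code points so characters outside the BMP yield one entity, not two.
        input.codePoints().forEach(cp -> {
            boolean ok = allowed.stream()
                .anyMatch(r -> cp >= r.start() && cp <= r.limit());
            if (ok) out.appendCodePoint(cp);
            else out.append("&#").append(cp).append(';');
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &#60;b&#62;5 &#8364; &#38; up&#60;/b&#62;
        System.out.println(escape("<b>5 \u20AC & up</b>", SAFE_ASCII));
    }
}
```

If the platform also rejects a literal apostrophe, drop 0x27 from the third range; the allow-list is the only thing that needs to change.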
2
Adarsh Shridhar On

You can also use the method present in org.springframework.web.util.HtmlUtils:

import org.springframework.web.util.HtmlUtils;

// Pass the input string as the argument; the escaped string is returned
String escaped = HtmlUtils.htmlEscapeDecimal(input);

I hope it will work for you as you expected.