StringEscapeUtils escapeJava is escaping pound signs

I'm trying to escape a string to ensure that special characters are escaped.

Using

StringEscapeUtils.escapeJava("😀")  escapes to \\uD83D\\uDE00

StringEscapeUtils.escapeJava("% ! @ $ ^ & * ") doesn't escape any of the characters

StringEscapeUtils.escapeJava("£") escapes to \\u00A3

I can understand that emojis contain backslashes and so are escaped, but why is the pound sign being escaped, and how do I stop it from being escaped?

Best answer

The documentation of StringEscapeUtils.escapeJava() is vague on exactly what "Java String rules" are.

I guess it is referring to the bit in JLS Chapter 3, where it says:

Programs are written in Unicode (§3.1), but lexical translations are provided (§3.2) so that Unicode escapes (§3.3) can be used to include any Unicode character using only ASCII characters.

and

ASCII (ANSI X3.4) is the American Standard Code for Information Interchange. The first 128 characters of the Unicode UTF-16 encoding are the ASCII characters.

So it might mean escaping the string so that it can be written using only ASCII characters.

%, !, @, $, ^, & and * are all ASCII characters. They have values less than 128 (i.e. they are in the 7-bit block).

£ isn't an ASCII character: in ISO8859-1, it is encoded as 163 (0xA3), which is outside the 7-bit ASCII block.
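To make the "ASCII only" reading concrete, here is a minimal sketch of an escaper that passes 7-bit characters through and writes everything else as a Unicode escape. It is not the library's implementation (the real escapeJava also handles quotes, backslashes and control characters); the class and method names are made up for illustration.

```java
final class AsciiEscapeSketch {

    /**
     * Hypothetical helper: copies chars below 128 unchanged and writes every
     * other UTF-16 code unit as a backslash, 'u', and four hex digits.
     */
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128) {
                sb.append(c);                                 // 7-bit ASCII: keep as-is
            } else {
                sb.append(String.format("\\u%04X", (int) c)); // non-ASCII: escape
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // All ASCII, so nothing changes.
        System.out.println(escapeNonAscii("% ! @ $ ^ & * "));
        // The pound sign is code unit 163 (0xA3), so it is escaped.
        System.out.println(escapeNonAscii("\u00A3"));
        // An emoji is two UTF-16 code units (a surrogate pair), so two escapes come out.
        System.out.println(escapeNonAscii("\uD83D\uDE00"));
    }
}
```

Under this reading the emoji is escaped for the same reason as the pound sign: its code units are outside the ASCII range, not because it "contains backslashes".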

If you open a source file containing £ in a string literal in an editor that assumes the wrong character encoding, it might be rendered as something else. For example, the byte 0xA3 would appear as Ł if it were interpreted as ISO8859-2.
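The ambiguity is easy to reproduce: the same byte decodes to different characters depending on the charset. The sketch below assumes the JVM ships the ISO-8859-2 charset, which is not one of the charsets guaranteed by StandardCharsets, though common JDK builds include it. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PoundByteDemo {

    /** Decodes a single byte using the given charset. */
    static String decode(byte b, Charset cs) {
        return new String(new byte[] { b }, cs);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xA3;

        String latin1 = decode(b, StandardCharsets.ISO_8859_1);
        // Assumption: ISO-8859-2 is available in this JVM.
        String latin2 = decode(b, Charset.forName("ISO-8859-2"));

        // Print code points rather than the characters themselves, so the
        // result does not depend on the console encoding.
        System.out.printf("ISO-8859-1: U+%04X%n", (int) latin1.charAt(0)); // U+00A3, the pound sign
        System.out.printf("ISO-8859-2: U+%04X%n", (int) latin2.charAt(0)); // U+0141, L with stroke
    }
}
```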

In order to be unambiguous, the pound sign is therefore escaped.

how do I stop it from being escaped

You can't, using this method; you'd need to find an alternative. Failing that, the only thing you can do is replace each \u00A3 in the escaped string with £ again.
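A rough sketch of that post-processing step follows. The escaped input is hard-coded as a stand-in for what escapeJava returned, so the example runs without the Commons library on the classpath; the class and method names are made up for illustration.

```java
final class RestorePound {

    /** Replaces the six-character escape sequence for U+00A3 with the pound sign itself. */
    static String restorePound(String escaped) {
        // "\\u00A3" is the literal text: backslash, 'u', '0', '0', 'A', '3'.
        return escaped.replace("\\u00A3", "\u00A3");
    }

    public static void main(String[] args) {
        // Stand-in for the output of StringEscapeUtils.escapeJava on "Price: " + pound sign + "5".
        String escaped = "Price: \\u00A35";
        // Prints the string with the pound sign restored.
        System.out.println(restorePound(escaped));
    }
}
```

One caveat: a plain replace cannot tell a real escape from input that already contained the literal text backslash-u-0-0-A-3. In that case the backslash itself would have been escaped (doubled), and the replace would still match the tail of it. If that can occur in your data, you would need something smarter than a blanket replace.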