Java String to UCS2 encoding for Letters with Accents


I need to encode a String that contains non-English characters, e.g. letters with accents, as UCS-2. The following piece of code works for plain English letters.

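String encodeAsUCS2(String test) throws UnsupportedEncodingException {
    // UTF-16BE gives two bytes per BMP character, high byte first, with no byte-order mark
    byte[] bytes = test.getBytes("UTF-16BE");

    // Render each byte as two uppercase hex digits
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
        sb.append(String.format("%02X", b));
    }

    return sb.toString();
}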

It outputs the hexadecimal sequence of the UCS-2/UTF-16 bytes,

e.g. hello = 00680065006C006C006F

It runs into a problem with accented letters and other non-English characters: each one comes out as FFFD. That is the replacement character from the Unicode Specials block, which is used when a system cannot decode or render a stream of data as the correct symbol.
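To narrow down where the FFFD comes from, here is a minimal sketch that compares two cases. It assumes (this is a guess, not something confirmed for this setup) that the accented text reaches the method after being decoded with the wrong charset. The class name `Ucs2Demo`, the sample word, and the ISO-8859-1/UTF-8 mix-up are all hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Ucs2Demo {

    // Same technique as the method above: UTF-16BE bytes rendered as uppercase hex.
    static String encodeAsUCS2(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_16BE);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Accented letter written as a Unicode escape, so the encoding
        // of the source file cannot interfere with the test.
        String accented = "caf\u00E9";
        System.out.println(encodeAsUCS2(accented)); // 00630061006600E9

        // Hypothetical input path: byte 0xE9 is the accented e in ISO-8859-1,
        // but it is malformed as UTF-8, so the decoder substitutes U+FFFD.
        byte[] latin1Bytes = accented.getBytes(StandardCharsets.ISO_8859_1);
        String misdecoded = new String(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println(encodeAsUCS2(misdecoded)); // 006300610066FFFD
    }
}
```

If the first case prints E9 correctly, the encoding step itself is fine and the replacement character is probably already in the String before the method is called, for example from the way the input is read or from the encoding the source file is compiled with.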

Is there any workaround for this?
