I am currently storing a String as an array of bytes. However, when I use the following code to convert the bytes back to a String using Charset, I get diamond characters at the end:
byte[] testbytes = "abc123".getBytes(); // tried getBytes("UTF-8"/StandardCharsets.UTF_8) too
Charset charset = Charset.forName("UTF-8"); // ISO-8859-1 has no diamonds
CharBuffer charBuffer = charset.decode( ByteBuffer.wrap( Arrays.copyOfRange(testbytes,0,testbytes.length) ) );
System.out.println("converted = " + String.valueOf(charBuffer.array()) );
// returns this - abc123����������
If I set the encoding to ISO-8859-1 instead, it converts fine. I thought it might be the encoding of the source code file but opening that in Notepad++ suggests it is also in UTF-8.
Am I missing something or is this just a problem with Android Studio's Logcat window?
- Edit 1 -
Further testing shows that 3-character strings do not have this trailing-padding problem. If you use longer strings, Charset.decode seems to pad out the char array with \u0000 values according to the break point.
String.valueOf ends up printing the padded characters as diamonds, while creating a new String object removes the padding. However, I would like to avoid String altogether when converting a byte array to a char array, because the values are sensitive.
- Edit 2 -
It appears the above happens if you call charset.decode() again, so I'm guessing there is a buffer that's being appended to, but I'm not sure at what point. I tried clearing it with charBuffer.clear(), but the second block of code's output appears to be the same, i.e. 3 chars + 2 spaces + 6 new chars.
String test1 = "123";
byte[] test1b = test1.getBytes();
char[] expected1 = test1.toCharArray();
CharBuffer charBuffer = charset.decode( ByteBuffer.wrap( test1b ) );
char[] actual1 = charBuffer.array(); // size 3, correct
String test2 = "123456";
byte[] test2b = test2.getBytes();
char[] expected2 = test2.toCharArray();
CharBuffer charBuffer2 = charset.decode( ByteBuffer.wrap( test2b ) );
char[] actual2 = charBuffer2.array(); // size 11, padded with '\u0000'
Did you try using the String constructor that takes an array of bytes, such as new String(testbytes, StandardCharsets.UTF_8)? Maybe it can solve your problem.
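If a String has to be avoided because the values are sensitive, it may help to know that CharBuffer.array() returns the whole backing array, whose capacity can be larger than the number of chars actually decoded. Only the chars between the buffer's position and limit are valid content. Below is a minimal sketch, assuming the buffer returned by Charset.decode is heap-backed; the class name SafeDecode and the helper decodeToChars are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SafeDecode {

    // Hypothetical helper: decodes bytes into a char[] of exactly the
    // decoded length, without creating an intermediate String.
    static char[] decodeToChars(byte[] bytes, Charset charset) {
        CharBuffer decoded = charset.decode(ByteBuffer.wrap(bytes));

        // remaining() == limit - position, i.e. the number of valid chars.
        char[] result = new char[decoded.remaining()];
        decoded.get(result); // relative bulk get: copies only the valid chars

        // Assumption: the buffer is array-backed. If so, wipe the backing
        // array so the sensitive value does not linger in it.
        if (decoded.hasArray()) {
            Arrays.fill(decoded.array(), '\u0000');
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] testbytes = "abc123".getBytes(StandardCharsets.UTF_8);
        char[] chars = decodeToChars(testbytes, StandardCharsets.UTF_8);

        System.out.println(chars.length);                                 // 6
        System.out.println(Arrays.equals(chars, "abc123".toCharArray())); // true

        // Wipe the result once it is no longer needed.
        Arrays.fill(chars, '\u0000');
    }
}
```

Sizing the result from remaining() rather than from array().length is what drops the trailing \u0000 padding. Arrays.copyOfRange(decoded.array(), decoded.position(), decoded.limit()) should behave the same way for an array-backed buffer whose array offset is zero.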