Why does commons-io IOUtils.toByteArray return different results for an InputStream and a Reader?

Question


267 Views Asked by wangxl At 28 July 2025 at 01:55

Why do I get different results when I use org.apache.commons.io.IOUtils to read the same file into a byte[]?

toByteArray accepts either an InputStream or a Reader, and I tried both:

```java
String file = "c:/c.pdf";

try (InputStream is = new FileInputStream(file)) {
    byte[] result = IOUtils.toByteArray(is);
    System.err.println(Arrays.toString(result));
} catch (Exception e) {
    e.printStackTrace();
}

try (Reader reader = new FileReader(file)) {
    byte[] result = IOUtils.toByteArray(reader, "gbk");
    System.err.println(Arrays.toString(result));
} catch (Exception e) {
    e.printStackTrace();
}
```


**Pino** · Answer 1

Short answer: the two results differ because the second approach is wrong. Never use a Reader to read binary data.

An InputStream reads the bytes of a file without giving them any meaning. A Reader, by contrast, decodes those bytes into characters using a specific charset. Your second example reads bytes, decodes them to characters (FileReader uses the platform default charset unless you pass one explicitly), and then toByteArray() encodes those characters back to bytes as GBK.

This double conversion is not just useless, it is wrong, because the first conversion is lossy. When the Reader encounters a byte (or a group of bytes, in multi-byte charsets like GBK) that has no associated character, it substitutes a replacement character (U+FFFD, usually displayed as a question mark). When that character is encoded back to bytes, you get the bytes of the replacement character or of a literal question mark, not the original value that failed the conversion. Even bytes that decode successfully can come back different if the decoding and encoding charsets are not the same.
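To make the lossy round trip concrete, here is a minimal sketch that uses only the JDK. It assumes UTF-8 rather than GBK so that it runs on any JVM, and the class name `RoundTripDemo`, the helper `roundTrip`, and the sample bytes are made up for illustration. It is not the commons-io implementation, just the same decode-then-encode sequence.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {

    // Decode bytes to characters through a Reader, then encode the
    // characters back to bytes with the same charset.
    public static byte[] roundTrip(byte[] original, Charset charset) throws IOException {
        StringBuilder chars = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(original), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                chars.append((char) c);
            }
        }
        return chars.toString().getBytes(charset);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: the ASCII text "%PDF" followed by two bytes
        // that are not valid UTF-8, standing in for binary PDF content.
        byte[] binary = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0xFE};
        byte[] result = roundTrip(binary, StandardCharsets.UTF_8);

        System.out.println("original:   " + Arrays.toString(binary));
        System.out.println("round trip: " + Arrays.toString(result));
        System.out.println("identical:  " + Arrays.equals(binary, result));
    }
}
```

The ASCII prefix survives the round trip, but the two invalid bytes do not, so the arrays no longer match. A real PDF contains many such byte sequences.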

So the problem is not in IOUtils, it is in your usage of a Reader for reading a PDF.
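If all you need is the exact bytes of the file, stay on the byte-stream side the whole time. Your first snippet with `IOUtils.toByteArray(InputStream)` already does that. The sketch below shows a JDK-only alternative with `Files.readAllBytes`; the class name `BinaryRead`, the helper `readExactBytes`, and the temporary file are illustrative assumptions, not part of your code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class BinaryRead {

    // Read the file as raw bytes; no charset is involved at any point.
    public static byte[] readExactBytes(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        // Write a small binary sample to a temporary file so the example
        // does not depend on a real PDF being present.
        Path tmp = Files.createTempFile("sample", ".bin");
        try {
            byte[] original = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0xFE};
            Files.write(tmp, original);

            byte[] read = readExactBytes(tmp);
            System.out.println("identical: " + Arrays.equals(original, read));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```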
