Gzip decompression adding one extra byte ... Why?

I've written a simple Java snippet that takes a String, converts it to a byte[], and compresses it with Gzip. It then decompresses the result to get the byte[] back, but the decompressed array contains one extra garbage byte. Why is that byte there?

public static void main(String[] args) throws Exception {
    String testString = "Sample String here";
    byte[] originalBytes = testString.getBytes();

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    GZIPOutputStream gzos = new GZIPOutputStream(baos);
    gzos.write(originalBytes);
    gzos.close();

    byte[] compressedBytes = baos.toByteArray();

    ByteArrayInputStream bais = new ByteArrayInputStream(compressedBytes);
    GZIPInputStream gzis = new GZIPInputStream(bais);

    ByteArrayOutputStream dbaos = new ByteArrayOutputStream();
    while (gzis.available() > 0) {
        dbaos.write(gzis.read());
    }
    byte[] decompressedBytes = dbaos.toByteArray();
    String decompressedString = new String(decompressedBytes);

    System.out.println(">>" + decompressedString + "<<");
    System.out.println("Size of bytes before: " + originalBytes.length);
    System.out.println("Size of bytes after: " + decompressedBytes.length);
}

Output:

>>Sample String here�<<
Size of bytes before: 18
Size of bytes after: 19

Can someone tell me why there is a garbage byte? How do I get rid of it WITHOUT changing the setup of the code above?

There is 1 solution below

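You are using available() to control the loop, so you get one extra byte. For a GZIPInputStream, available() returns 1 until EOF has actually been reached, and EOF is only detected by a read() that returns -1. After the last real byte, available() is therefore still 1, read() returns -1, and write(-1) stores its low-order byte (0xFF), which is the garbage byte you see. You should be reading the stream and stopping when read() returns a value less than 0. Change this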

ByteArrayOutputStream dbaos = new ByteArrayOutputStream();
while(gzis.available() > 0) {
    dbaos.write(gzis.read());
}

to something like

ByteArrayOutputStream dbaos = new ByteArrayOutputStream();
int b;
while ((b = gzis.read()) >= 0) {
    dbaos.write(b);
}

and I get

>>Sample String here<<
Size of bytes before: 18
Size of bytes after: 18
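
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch that puts the loop from the question next to a buffered variant of the fix. The class and method names (GzipRoundTrip, compress, decompress, decompressWithAvailable) and the 4096-byte buffer size are made up for illustration; reading into a byte[] buffer instead of one byte at a time is an optional variation and not part of the answer above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names here are illustrative only.
class GzipRoundTrip {

    // Gzip-compress a byte array; closing the stream writes the gzip trailer.
    static byte[] compress(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzos = new GZIPOutputStream(baos)) {
            gzos.write(input);
        }
        return baos.toByteArray();
    }

    // The loop from the question: available() is still 1 after the last
    // data byte, so the final read() returns -1 and write(-1) stores 0xFF.
    static byte[] decompressWithAvailable(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzis =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            while (gzis.available() > 0) {
                out.write(gzis.read());
            }
        }
        return out.toByteArray();
    }

    // Buffered variant of the fix: stop when read() reports end of stream (-1).
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        try (GZIPInputStream gzis =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = gzis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "Sample String here".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = compress(original);

        System.out.println("Size of bytes before: " + original.length);
        System.out.println("Size with available() loop: "
                + decompressWithAvailable(compressed).length);
        System.out.println("Size with read() loop: "
                + decompress(compressed).length);
    }
}
```

Run as is, this should report 19 bytes for the available() loop and 18 for the read() loop, matching the outputs quoted above. Passing an explicit charset to getBytes() and new String() also avoids depending on the platform default encoding.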