Java CRC error when using a dictionary with GZIP


This is honestly frustrating because I think I know the cause, but I cannot pinpoint where it happens in my code. For this assignment, we're supposed to read an input stream, split it into 128-byte blocks, and compress each block using the last 32 bytes of the previous block as a dictionary.

import java.io.*;
import java.util.zip.*;

public class TestCase
{
    protected static final int BLOCK_SIZE = 128;
    protected static final int DICT_SIZE = 32;

    public static void main(String[] args)
    {
        BufferedInputStream inBytes = new BufferedInputStream(System.in);
        byte[] buff = new byte[BLOCK_SIZE];
        byte[] dict = new byte[DICT_SIZE];
        int bytesRead = 0;

        try
        {
            DGZIPOutputStream compressor = new DGZIPOutputStream(System.out);
            bytesRead = inBytes.read(buff);

            if (bytesRead >= DICT_SIZE)
            {
                System.arraycopy(buff, 0, dict, 0, DICT_SIZE);
            }

            while(bytesRead != -1) 
            {
                compressor.write(buff, 0, bytesRead);              
                if (bytesRead == BLOCK_SIZE)
                {
                    System.arraycopy(buff, BLOCK_SIZE-DICT_SIZE, dict, 0, DICT_SIZE);
                    compressor.setDictionary(dict);
                }

                bytesRead = inBytes.read(buff);
            }
            compressor.flush();         
            compressor.close();
        }
        catch (IOException e)
        {
            e.printStackTrace();
            System.exit(-1);
        }
    }

    public static class DGZIPOutputStream extends GZIPOutputStream
    {
        public DGZIPOutputStream(OutputStream out) throws IOException
        {
            super(out);
        }

        public void setDictionary(byte[] b)
        {
            def.setDictionary(b);
        }

        public void updateCRC(byte[] input)
        {
            crc.update(input);
            // Debug output goes to stderr; stdout carries the compressed stream.
            System.err.println("Called!");
        }                       
    }
}

The output is off by a single byte. I know that write() updates the CRC for the byte array it is given, and I think that for some reason the CRC is being updated twice, but I cannot for the life of me figure out where. Or maybe I'm off completely. It is only this one byte, and when I take out the dictionary it works just fine, so I'm really not sure.
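For reference, the gzip format (RFC 1952) ends with an 8-byte trailer: the CRC-32 of the uncompressed data, then the uncompressed length modulo 2^32, both little-endian. The sketch below is one way to check whether the CRC itself is wrong. It compresses a buffer with a plain GZIPOutputStream (no dictionary) and compares the trailer against a CRC32 computed directly. The class name GzipTrailerCheck and its helper methods are made up for illustration and are not part of the code above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
public class GzipTrailerCheck {
    // Compress data with a plain GZIPOutputStream (no dictionary involved).
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(data);
        }
        return sink.toByteArray();
    }

    // Read an unsigned 32-bit little-endian value at the given offset.
    static long readUInt32LE(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt() & 0xFFFFFFFFL;
    }

    // CRC-32 stored in the trailer: the first 4 of the last 8 bytes.
    static long trailerCrc(byte[] gz) {
        return readUInt32LE(gz, gz.length - 8);
    }

    // Uncompressed size stored in the trailer: the last 4 bytes.
    static long trailerSize(byte[] gz) {
        return readUInt32LE(gz, gz.length - 4);
    }

    // CRC-32 computed directly over the uncompressed bytes.
    static long crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world, how are you? 123efd4\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] gz = gzip(data);
        System.out.println("trailer CRC matches: " + (trailerCrc(gz) == crcOf(data)));
        System.out.println("trailer size matches: " + (trailerSize(gz) == data.length));
    }
}
```

If the trailer CRC of the failing stream matches the CRC of the input, the stored checksum is fine and the mismatch is more likely in what the decompressor reconstructs than in an extra CRC update.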

EDIT: So when I compile and test it:

$ cat file.txt

hello world, how are you? 123efd4
KEYBOARDSMASHR#@Q)KF@_{KFSKFDS
000000000000000000000000000000000000000000000000000
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
pwfprejgewojgw
12345678901234567890
!@#$%^&*(!@#$%^&*(A

$ cat file.txt | java TestCase | gzip -d | cmp file.txt ; echo $?

gzip: stdin: invalid compressed data--crc error
file.txt - differ: byte 1, line 1
1
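The gzip header defined in RFC 1952 has no field announcing a preset dictionary, so gzip -d cannot be handed one, which may be relevant to the failure above. The zlib format does have such a flag. The sketch below shows a per-block round trip under that assumption: each 128-byte block is compressed as its own zlib stream with java.util.zip.Deflater, primed with the last 32 bytes of the previous block, and inflated with java.util.zip.Inflater, which asks for the dictionary through needsDictionary(). The class DictRoundTrip and its methods are hypothetical, and whether separate zlib streams are what the assignment expects is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: one zlib stream per block, not a single gzip stream.
public class DictRoundTrip {
    static final int BLOCK_SIZE = 128;
    static final int DICT_SIZE = 32;

    // Compress one block as a zlib stream, optionally primed with a dictionary.
    static byte[] deflate(byte[] block, byte[] dict) {
        Deflater def = new Deflater();
        if (dict != null) {
            def.setDictionary(dict); // must be set before any input is compressed
        }
        def.setInput(block);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished()) {
            out.write(buf, 0, def.deflate(buf));
        }
        def.end();
        return out.toByteArray();
    }

    // Inflate one zlib stream, supplying the dictionary when the stream asks for it.
    static byte[] inflate(byte[] packed, byte[] dict) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!inf.finished()) {
            int n = inf.inflate(buf);
            out.write(buf, 0, n);
            if (n == 0) {
                if (inf.needsDictionary()) {
                    if (dict == null) {
                        throw new DataFormatException("stream needs a dictionary");
                    }
                    inf.setDictionary(dict);
                } else if (inf.needsInput()) {
                    throw new DataFormatException("truncated stream");
                }
            }
        }
        inf.end();
        return out.toByteArray();
    }

    // Split data into blocks, compress and decompress each, and rejoin the result.
    static byte[] roundTrip(byte[] data) throws DataFormatException {
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        byte[] dict = null; // the first block has no predecessor, so no dictionary
        for (int off = 0; off < data.length; off += BLOCK_SIZE) {
            int end = Math.min(off + BLOCK_SIZE, data.length);
            byte[] block = Arrays.copyOfRange(data, off, end);
            byte[] plain = inflate(deflate(block, dict), dict);
            restored.write(plain, 0, plain.length);
            if (block.length == BLOCK_SIZE) {
                // Last 32 bytes of this block become the next block's dictionary.
                dict = Arrays.copyOfRange(block, BLOCK_SIZE - DICT_SIZE, BLOCK_SIZE);
            }
        }
        return restored.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 40; i++) {
            sb.append("12345678901234567890\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.US_ASCII);
        System.out.println("round trip ok: " + Arrays.equals(data, roundTrip(data)));
    }
}
```

The point of the sketch is that the decompressing side has to be given the same dictionary bytes the compressor used; a decompressor that never receives them has nothing to resolve the back-references against.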

(ignore my choice of file, I was sleepy last night)

EDIT: Solved
