LZW- Compression in Python

1.1k Views Asked by At

For a university course work exam, I need to write code that computes a LZW compression on a string, with specific requirements, then translates via Elias coding into numbers.

We are allowed to use the internet/friends/other examples to do this code, so this is all allowed.

The specs are:

Assign an initial integer code to each possible character. This is the 
initial codebook

Keep a buffer of the last remembered substring, which is initially empty.

Iterate through each character of the string to compress.

Keep appending characters to the buffer until there is no codebook
entry for the buffer with the new character appended. When this
happens:

    Add a new code for the buffer to the codebook.

    Append the integer code for the buffer without the new character to
   the compressed output. This code will already be in the codebook.
    (think about why this must be true).

    Set the buffer to be the new character on its own.

Compression worked example: For example, if the string "AABAABAA" was being compressed, then:

1. A would have a known code.
2. AA would not have a known code. A new shorthand code would be created,
and the code for A would be output.
3. AB would not have a known code. A new shorthand code would be created
and the code for A would be output.
4. BA would not have a known code. A new shorthand code would be created
and the code for B would be output.
5. the code for AA would be known (we created it at step 2)
6. the code for AAB would not be known. A new shorthand code for AAB would
be created and the code for AA output.
7. the code for BA would be known from step 4.
8. the code for BAA would not be known. A new shorthand code for BAA would 
be created and the code for BA output.
9. A is the last remaining character in the string, so the code would be 
output directly.

Output

The result of this compression will be a a sequence of positive integers representing the compressed version of the string. (the compressed output). The codes in the codebook must be positive numbers, ordered sequentially without any gaps, starting from 0.

How would I start such a code?

Thanks!

0

There are 0 best solutions below