I am reading up on the LZW algorithm and have the following implementation:
def compress(data):
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    result = []
    sequence = ""
    for char in data:
        new_sequence = sequence + char
        if new_sequence in dictionary:
            sequence = new_sequence
        else:
            result.append(dictionary[sequence])
            dictionary[new_sequence] = next_code
            next_code += 1
            sequence = char
    if sequence:
        result.append(dictionary[sequence])
    return result
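
def decompress(data):
    # Empty input decompresses to an empty string (avoids IndexError on data[0]).
    if not data:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    result = []
    # The first code always stands for a single character.
    sequence = chr(data[0])
    result.append(sequence)
    for code in data[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code is not in the dictionary yet, so it must be the entry
            # that is about to be added: previous sequence + its first character.
            entry = sequence + sequence[0]
        else:
            raise ValueError("Invalid compressed data")
        result.append(entry)
        dictionary[next_code] = sequence + entry[0]
        next_code += 1
        sequence = entry
    return "".join(result)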
However, the output of LZW is a list of integers, and those integers can exceed 255, so each one does not fit in a single byte. I assume that to use this in practice you would want bytes instead, so that the result can be saved to disk. But once the codes have been turned into a sequence of bytes, how do you read those bytes back into codes in order to decompress?
tl;dr: How do you take the integer list that LZW outputs, save it to disk as bytes, and read it back?
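
To make the question concrete, here is a minimal sketch of one possible approach, not a definitive answer: pack every code with the same fixed bit width and store that width plus the number of codes in a small header, so the reader knows how to split the bytes back into codes. The function names (`codes_to_bytes`, `bytes_to_codes`, `save_codes`, `load_codes`) and the 5-byte header layout are assumptions made up for illustration; they do not belong to any standard LZW file format.

```python
import struct
from pathlib import Path

# Hypothetical header: 1 byte for the bit width, 4 bytes for the code count.
HEADER = struct.Struct("<BI")


def codes_to_bytes(codes):
    """Pack integer codes into bytes, using one fixed bit width for all codes."""
    width = max(1, max(codes, default=0).bit_length())
    out = bytearray(HEADER.pack(width, len(codes)))
    buffer = 0  # bits waiting to be written
    bits = 0    # number of valid bits in buffer
    for code in codes:
        buffer = (buffer << width) | code
        bits += width
        while bits >= 8:
            bits -= 8
            out.append((buffer >> bits) & 0xFF)
        buffer &= (1 << bits) - 1  # keep only the unwritten bits
    if bits:
        # Pad the final partial byte with zero bits on the right.
        out.append((buffer << (8 - bits)) & 0xFF)
    return bytes(out)


def bytes_to_codes(blob):
    """Inverse of codes_to_bytes: read the header, then split the bit stream."""
    width, count = HEADER.unpack_from(blob, 0)
    mask = (1 << width) - 1
    codes = []
    buffer = 0
    bits = 0
    for byte in blob[HEADER.size:]:
        buffer = (buffer << 8) | byte
        bits += 8
        # Stop at `count` codes so trailing padding bits are ignored.
        while bits >= width and len(codes) < count:
            bits -= width
            codes.append((buffer >> bits) & mask)
        buffer &= (1 << bits) - 1
    return codes


def save_codes(path, codes):
    Path(path).write_bytes(codes_to_bytes(codes))


def load_codes(path):
    return bytes_to_codes(Path(path).read_bytes())
```

With these helpers, something like `save_codes("out.lzw", compress(text))` followed by `decompress(load_codes("out.lzw"))` should round-trip the text. A fixed width is the simplest choice but not the most compact: established formats such as GIF start with narrow codes and widen them as the dictionary grows, which requires the reader to track the dictionary size while decoding.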