How to design an efficient algorithm to support random access in LZ77 compressed sequences?

203 Views Asked by At

I'm working on a project in text compression and I need to design an efficient algorithm in a LZ77 compressed sequences. Specially, Given a LZ77 compressed sequence and an index i, we can recover a single symbol S[i] of the input sequence S. The space consumed by the algorithm and the time of random access to a symbol are what we pursue。

Thanks in advance for your suggestions.

1

There are 1 best solutions below

5
On

See zran.c for an example. It builds an index with entry points into a gzip or zlib stream. You can select approximately how close the entry points are to each other, in terms of the uncompressed data. To get to a random byte of uncompressed data, you start decompressing at the closest entry point that precedes that byte, and decompress until you get to that byte.

The tradeoff is less storage for fewer entry points, but a longer time to get to any given byte, vs. more storage and less time.