Canonical Huffman Encoder : Contents of Encoded Bitstream

405 Views Asked by At

Let's say we have the following canonical Huffman code table.

Symbol    Code-length   Codeword
 A            2          00
 B            2          01
 C            2          10
 D            2          11

Now, we read the symbols from a input file and encode it by just looking at the above table. However, many resources say in case of canonical Huffman we should not send the codewords. Instead, code length for each symbol is enough.

If a text file contains ACCDB, should I transmit 00 01 10 11 or 10 10 10 10 (binary equivalent of corresponding code length) as encoded bit stream? Please rectify me if I am wrong and I appreciate any explanation.

Moreover, if that is the case for canonical Huffman, how would we decode that bit stream to get back original symbols ACCDB (without using Huffman tree at decoder)?

1

There are 1 best solutions below

3
On

That is not a canonical Huffman code table, nor is that a Huffman code, nor is that a prefix code. The code lengths 1, 2, 2, 3 oversubscribe the available bits. 1, 2, 2 is a complete code, allowing no more symbols to be coded.

1, 2, 3, 3 is a complete and not-oversubscribed code, in which case an example of the codes would be 0, 10, 110, 111. You can see that those codes can be decoded uniquely, reading them left to right.