Using a huffman tree to decode binary text

403 Views Asked by At

The snippet below is not returning the correct text. The code takes in a pointer to the root node of a Huffman code tree and a binary text, which it then converts. However, every time it returns a single letter repeated.

string decode(Node *root, string code) {
    string d = ""; char c; Node *node = root;
    for (int i = 0; i < code.size(); i++) {
        node = (code[i] == '0') ? node->left_child : node->right_child;
        if ((c = node->value) < 128) {
            d += c;
            node = root;
        }
    }
    return d;
}

The code for the Node object:

class Node {
public:
  Node(int i, Node *l = nullptr, Node *r = nullptr) {
      value = i;
      left_child = l;
      right_child = r;
  }
  int value;
  Node *left_child;
  Node *right_child;
};

The code for building the tree:

Node* buildTree(vector<int> in, vector<int> post, int in_left, int in_right, int *post_index) {
    Node *node = new Node(post[*post_index]);
    (*post_index)--;

    if (in_left == in_right) {
        return node;
    }

    int in_index;
    for (int i = in_left; i <= in_right; i++) {
        if (in[i] == node->value) {
            in_index = i;
            break;
        }
    }

    node->right_child = buildTree(in, post, in_index + 1, in_right, post_index);
    node->left_child = buildTree(in, post, in_left, in_index - 1, post_index);

    return node;
}

Example tree:

        130
       /   \
     129    65
    /   \
  66     128
        /   \
      76     77 

Example I/O: Input: 101010010111 Output: A�A�A��A�AAA

The diamond characters are the numbers greater than 128.

1

There are 1 best solutions below

1
On

You are putting the value in a char, which for most C++ compilers is signed. But not all -- whether char is signed or unsigned is implementation defined. A signed char is in the range –128 to 127, so it is always less than 128. (Your compiler should have warned you about that.)

You need to use int c; instead of char c; in decode(), and do d += (char)c;. Then your first code snippet will correctly return ALABAMA.

By the way, there needs to be an error check in decode(), verifying that you exit the loop with node equal to root. Otherwise, there were some bits provided that ended in the middle of a code, and so were not decoded to a symbol.