Radix tree vs Hashtable

38 Views Asked by At

I wanted to compare the hash table vs radix tree. For fixed length keys, for sure, hash table is always the best option, considering that we don't care about sorted order, but for variable length keys, things are much complicated, so the question will assume variable length keys.

Time complexities:

  1. Hashtable insertion time is O(n) due to hash function.
  2. Radix tree insertion time is said to have O(n) as well.

I tried the following where data1 is "testtesttest...." and data2 is "testtesttest...2"(they are 40000 lengths each and data2 only has one string different in the end from data1). For hash table insertion, the only time consuming is hash function so I just use hash function, even though this is not the has function most languages use.

console.time();
radixTree.addWord(data1);
radixTree.addWord(data2);
console.timeEnd();

console.time();
keccak256(data1);
keccak256(data2);
console.timeEnd();

Basically, adding 2 such strings in a radix tree takes 9ms whereas in hash functions take 2ms. You can add extra overhead for actually storing it in hashtable(since I'm not using it here), but the most it will take is 3ms. So 9ms vs 3ms. Why does it say that both algorithm's insertion time is O(n)?

If we don't care about the ordered set and just building a dictionary, I don't ever see raddix tree winning over hashtable in terms of performance. In terms of memory, first creation of hashtable might take more memory but in terms of time complexities, hashtable always results in the better performance. Your thoughts?

Radix tree is only good if you compare it with other trees and when you need an ordered list, isn't this true?

1

There are 1 best solutions below

0
Matt Timmermans On

Common reasons for using some kind of radix tree (usually large fan-out B-tree variant) instead of a hash table for variable-length keys:

  • You need the ordering (as you've noted)
  • Hash table complexities are "expected". Sometimes you need to bound your real worst-case complexity.
  • As hash tables grow, you sometimes need to reallocate the whole thing. Sometimes that's unacceptable. You can make hash tables that grow incrementally, but it's a pain and wastes memory.
  • On block-structured media like disk or flash, B-tree-like data structures are often used to keep track of the blocks that make up a file. If you can go directly to the media, then you can make a radix tree where you map your B-tree blocks directly to media blocks, and you don't need this extra B-tree layer. A radix tree can actually do less pointer-following than a hash table.