Custom implementation of a HashTable in Java?

1.5k Views Asked by At

I was solving the Quora problem and for my particular solution I needed a hashtable (long-keys, int-values) for caching values. I hoped that the Java HashMap could be improved because I knew my data types for the keys and values and they were primitives and also my problem space. I decided to naively go ahead implement a simple hashtable using the "array-of-linkedlist" structure (even my linkedList was my own Node class I had implemented). But I noticed that my own naive implementation was about 4x slower than the generic Java HashMap. I also tried to use Trove's LongToIntMap library to see what they do. Does anyone have any good suggestions to build a custom Long to Int hashtable in Java that significantly outperforms the Java HashMap?

2

There are 2 best solutions below

0
On

Have a look at Javolution's FastMap. Source code is available here.

2
On

I also tried to use Trove's LongToIntMap library to see what they do.

Did you try looking at the code to see how they do it?


It is not possible to say say for sure what you did wrong in your implementation without looking at the code. However one possible improvement might be to replace the LinkedList<Integer> with a custom "list of integer" type that uses int[] to represent the lists. Depending on your hash table API, you should be able to avoid the cost of representing your values as objects (specifically Integers). (And as a corollary, you will get better performance and space utilization by NOT implementing an API that has generic types for the key and/or value types.)

For what it is worth, one mistake that could lead to poor performance is neglecting to implement hashtable resizing. Without resizing, the complexity of get and put operations on the table will be O(N) rather than O(1) ... because the hash chain lengths will grow in proportion to the number of hash table entries.

Finally, you need to be clear in your mind whether you are optimizing for performance or space utilization. The optimal solution will be different ....