I have been reading and learning about hashing and hashtables and experimented with some code (I am still very new to this, so I might say something wrong that I misunderstood). I ran into an issue with perfect hash functions. Suppose I have my own custom type that somehow has a perfect hash function:
class Foo
{
    private int data;

    public override int GetHashCode()
    {
        return data.GetHashCode();
    }
}
An int's hash code is the int itself, so I have a perfect hash function, right? But when we use the hash function to map the objects into a hashtable with the simple formula:
index = foo.GetHashCode() % hashtable.Length
we get an index that also depends on the size of the hashtable. Only if the hashtable's size were int.MaxValue would we have a perfect hash function. For example, let's say we have a hashtable of size 2. If we hash the numbers 1 and 3, we get
1 % 2 = 1
3 % 2 = 1
A collision! Have I misunderstood something about hashing and hashtables? It turns out that a perfect hash function is not perfect.
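To see this concretely, here is a minimal, self-contained sketch of the situation above (the Foo constructor is hypothetical, added only so the demo can set the private field):

using System;

class Foo
{
    private int data;

    // Hypothetical constructor, added only for this demo.
    public Foo(int data) { this.data = data; }

    public override int GetHashCode() => data.GetHashCode();
}

class CollisionDemo
{
    static void Main()
    {
        const int tableLength = 2;
        var a = new Foo(1);
        var b = new Foo(3);

        // Both objects land in slot 1 of the 2-slot table: a collision,
        // even though GetHashCode itself never collides on distinct data.
        Console.WriteLine(a.GetHashCode() % tableLength);  // 1
        Console.WriteLine(b.GetHashCode() % tableLength);  // 1
    }
}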
Yes! (as said, by definition)
Where do you get a p.h.f. from in the first place? You want to hash a fixed, i.e. constant, set S of distinct values (i.e. not a multiset) onto the set 1..|S|, bijectively. Evidently, then, the p.h.f. depends on the set S.
Also, remove a single element from S and add another one, and you will almost surely get a collision (of the new element with an old one).
So, what you actually want is "a p.h.f. for such-and-such a well-defined/well-described set". And then we can try to find one.
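As a minimal sketch of that last point, assuming nothing beyond standard .NET (the helper name FindPerfectModulus is made up for this example, not a library function): for a fixed set S, search for the smallest table size m such that x % m is collision-free on S. The resulting function x => x % m is a perfect hash function for that S, and only for that S.

using System;
using System.Collections.Generic;
using System.Linq;

class PerfectHashSketch
{
    // For a fixed set s, find the smallest modulus m >= |s| such that
    // x % m is collision-free on s. The result is a perfect hash
    // function for this particular set only.
    static int FindPerfectModulus(int[] s)
    {
        for (int m = s.Length; ; m++)
        {
            var seen = new HashSet<int>();
            // ((x % m) + m) % m keeps the slot non-negative for negative x.
            if (s.All(x => seen.Add(((x % m) + m) % m)))
                return m;
        }
    }

    static void Main()
    {
        int[] s = { 1, 3, 18, 29 };
        int m = FindPerfectModulus(s);
        Console.WriteLine($"m = {m}");
        foreach (int x in s)
            Console.WriteLine($"{x} -> {x % m}");
    }
}

For S = { 1, 3, 18, 29 } this finds m = 6 (slots 1, 3, 0, 5, all distinct). And exactly as said above: change the set, e.g. add 7 to S, and 7 % 6 collides with 1 % 6, so the function stops being perfect.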