LSH and minhashing - Why does hashing the signature matrix make sense?


I'm learning about LSH and minhashing and I'm trying to understand the rationale for hashing the signature matrix:

We divide the signature matrix into bands and hash (using which hash function?) each column's portion within a band into k buckets. Why does that make sense? If we use a regular hash function, then even a slight difference between two columns would probably send them to different buckets.
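To make the step concrete, here is a minimal sketch of the banding procedure as I understand it. The function name `lsh_candidate_pairs`, the layout of `signatures`, and the toy values are my own assumptions for illustration. I also use each band's tuple directly as the bucket key, which stands in for a regular hash function with no accidental collisions rather than a hash into exactly k buckets.

```python
from collections import defaultdict
from itertools import combinations


def lsh_candidate_pairs(signatures, bands, rows_per_band):
    """Return pairs of column ids that share a bucket in at least one band.

    signatures: dict mapping a column id to its list of minhash values,
    each of length bands * rows_per_band (hypothetical layout).
    """
    candidates = set()
    for b in range(bands):
        start = b * rows_per_band
        buckets = defaultdict(list)  # a fresh set of buckets for every band
        for col_id, sig in signatures.items():
            # The band's slice is the bucket key: two columns land in the
            # same bucket only if they agree on every row of this band.
            key = tuple(sig[start:start + rows_per_band])
            buckets[key].append(col_id)
        for ids in buckets.values():
            # Every pair of columns sharing a bucket becomes a candidate.
            for pair in combinations(sorted(ids), 2):
                candidates.add(pair)
    return candidates


# Toy signatures (made-up values): A and B agree on the first three rows only.
signatures = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [1, 2, 3, 9, 9, 9],
    "C": [7, 8, 9, 4, 5, 0],
}
pairs = lsh_candidate_pairs(signatures, bands=2, rows_per_band=3)
print(pairs)
```

With two bands of three rows, A and C agree on two of the three rows in the second band, yet that one differing value puts them in different buckets, which is exactly the behaviour I'm asking about.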

I do understand the relation between the signature matrix and Jaccard distance, but I don't understand the next step, which is essentially a hashing step that distributes items evenly.
