How to get an int type hash using message digest for bloom filters

I am trying to implement a Bloom filter in Java, and one of the hash functions has to use MessageDigest. In the add method, the other hashes I have created are used to set the corresponding index of the BitSet to true. I need to create a hash using MessageDigest to achieve the same goal, but I cannot find a way to return an int from it. My add method and my attempt at the hash:

public void add(String element) {
    int index = Math.abs(element.hashCode())%size;
    int index1 = myHash(element);

    //  int index2 = mdHash(element);

    b.set(index, true); 
    b.set(index1, true);
}

public int mdHash(String message) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    md.update(message.getBytes()); 
    byte[] digest = md.digest();       
    return ;
}

How can I create a hash from the digest that can also be used to set an index to true?

Thomas Mueller On

For a Bloom filter, SHA-256 isn't normally needed, and it is slow. Instead, I would use e.g. MurmurHash. Or just use hashCode(), and then apply another hash function to turn it into a 64-bit hash.
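If the message digest is still a requirement, as in the question, one possible way to get an int out of it is to read the first four bytes of the digest and reduce them to the BitSet size. This is only a sketch: the class name DigestIndex and the size parameter are assumptions for illustration, and the choice of the first four bytes is arbitrary (any four bytes of a SHA-256 digest should do equally well).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestIndex {
    // Maps a string to an index in [0, size) using the first 4 bytes of its SHA-256 digest.
    static int mdHash(String message, int size) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(message.getBytes(StandardCharsets.UTF_8));
            int h = ByteBuffer.wrap(digest).getInt(); // first 4 bytes, big-endian
            // floorMod never returns a negative value, unlike Math.abs(h) % size,
            // which is negative when h is Integer.MIN_VALUE.
            return Math.floorMod(h, size);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mdHash("hello", 1024));
    }
}
```

In the add method from the question this could be used as b.set(DigestIndex.mdHash(element, size), true). Whichever function is chosen, a digest like this or a cheaper non-cryptographic hash, its output can serve as the starting hash.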

Then, from that hash, you can generate as many numbers as the Bloom filter needs, for example as follows (see also the fastfilter_java project):

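public void add(long key) {
    // One 64-bit hash per key; all k indexes are derived from it.
    long hash = Hash.hash64(key, seed);
    // a: the hash with its upper and lower 32 bits swapped; b: the step added each round.
    long a = (hash >>> 32) | (hash << 32);
    long b = hash;
    for (int i = 0; i < k; i++) {
        // The upper 32 bits of a select the long in data; the low 6 bits of a select the bit
        // (Java uses only the low 6 bits of the shift distance for a long).
        data[Hash.reduce((int) (a >>> 32), arraySize)] |= 1L << a;
        a += b;
    }
}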
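
The loop above depends on the Hash class and the long[] array from fastfilter_java. A self-contained variant built on java.util.BitSet, as used in the question, might look roughly like the sketch below. The names SimpleBloom, mix64, SEED, and mightContain are hypothetical; mix64 is the 64-bit finalizer from MurmurHash3, swapped in here for Hash.hash64, and Math.floorMod is swapped in for Hash.reduce. Because it starts from String.hashCode(), it has only 32 bits of input entropy, which is an assumption that may not suit large filters.

```java
import java.util.BitSet;

class SimpleBloom {
    private static final long SEED = 0x9e3779b97f4a7c15L; // arbitrary odd constant (assumption)
    private final BitSet bits;
    private final int size; // number of bits in the filter
    private final int k;    // number of indexes set per element

    SimpleBloom(int size, int k) {
        this.bits = new BitSet(size);
        this.size = size;
        this.k = k;
    }

    // 64-bit finalizer from MurmurHash3 (fmix64); spreads the input over all 64 bits.
    static long mix64(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    void add(String element) {
        long hash = mix64(element.hashCode() + SEED);
        long a = (hash >>> 32) | (hash << 32); // halves swapped, as in the loop above
        long b = hash;
        for (int i = 0; i < k; i++) {
            bits.set((int) Math.floorMod(a, (long) size)); // index in [0, size)
            a += b;
        }
    }

    // Returns false only if the element was definitely never added.
    boolean mightContain(String element) {
        long hash = mix64(element.hashCode() + SEED);
        long a = (hash >>> 32) | (hash << 32);
        long b = hash;
        for (int i = 0; i < k; i++) {
            if (!bits.get((int) Math.floorMod(a, (long) size))) {
                return false;
            }
            a += b;
        }
        return true;
    }
}
```

The point of the design is that a single 64-bit hash is computed per element, and the k indexes come from repeatedly adding b to a, so no additional hash functions are needed.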