Why Java implement different hashcode method of Set and ArrayList?

5.2k Views Asked by At
 // this is the hashCode method of Set
public int hashCode() {
    int h = 0;
    Iterator<E> i = iterator();
    while (i.hasNext()) {
        E obj = i.next();
        if (obj != null)
            h += obj.hashCode();
    }
    return h;
}



//this is the hashCode method of List
public int hashCode() {
    int hashCode = 1;
    for (E e : this)
        hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
    return hashCode;
}

why java use these two different approaches? Is there anything related to the characteristic of Set and List? Why it uses 31 but not other numbers? Thanks!

2

There are 2 best solutions below

0
On BEST ANSWER

Sets are unordered, so {a, b, c} must have the same hash code as {c, b, a}. Addition is commutative, so adding the elements' hashCodes gives you that property.

Lists are ordered, so while [a, b, c] may have the same hash code as [c, b, a], it does not need to -- and it'd be better if it didn't, since as much as possible non-equal objects should try to have non-equal hashCodes. The ArrayList.hashCode implementation has that property.

Note that Set and List both define how implementations must define equals and hashCode (Set.hashCode, List.hashCode), so any (compliant) implementation of those respective collections is going to look pretty much the same. This gives you the useful property that a Set containing the same elements is equal (and thus has the same hashCode) as any other Set, regardless of the underlying implementations.

0
On

I will try to elaborate a bit more analytical, on this interesting topic.

As defined from Oracle, there must be a contract between equals and hashCode.

The general contract of hashCode is:

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

So let's see what the contract for equals is for List.

As documented by Oracle for equals method:

In other words, two lists are defined to be equal if they contain the same elements in the same order.

So if we check now how the hashCode is implemented for List

public int hashCode() {
    int hashCode = 1;
    for (E e : this)
        hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
    return hashCode;
}

The line hashCode = 31*hashCode ... ensures that the order of the list elements for which the hashcode method calculates the total result, affects the final result.

If this was not there and it was defined simple as hashCode = hashCode + (e==null ? 0 : e.hashCode()); then 2 lists with the same elements in different order might be unequal according to equals but produce the same hashcode. This is not necessary according to contract, but it could improve the performance so that is why Oracle applied it.

For Set implementation of hashcode this is not required since the ordering is not related at all with equals of Set. 2 Sets can be unequal and this is not related at all with the ordering of elements since Set by default does not have a specific order.

So the implemented hashcode for Set does not need any shuffling with .. 31*hashcode... and so it can be as is.

while (i.hasNext()) {
        E obj = i.next();
        if (obj != null)
            h += obj.hashCode();
    }