Java: HashSet what is the Compare concept?

2.6k Views Asked by At

Coming from a c++ world, I find reading of the HashSet documentation somewhat hard:

In c++, you would have:

which in turns points to:

Which makes it obvious the requirement for the type of element handled by a std::set. My question is: What are the requirements for the type (E) of elements maintained by a Set in Java ?

Here is a short example which I fail to understand:

import gdcm.Tag;
import java.util.Set;
import java.util.HashSet;

public class TestTag
{
  public static void main(String[] args) throws Exception
    {
      Tag t1 = new Tag(0x8,0x8);
      Tag t2 = new Tag(0x8,0x8);
      if( t1 == t2 )
        throw new Exception("Instances are identical" );
      if( !t1.equals(t2) )
        throw new Exception("Instances are different" );
      if( t1.hashCode() != t2.hashCode() )
        throw new Exception("hashCodes are different" );
      Set<Tag> s = new HashSet<Tag>();
      s.add(t1);
      s.add(t2);
      if( s.size() != 1 )
        throw new Exception("Invalid size: " + s.size() );
    }
}

The above simple code fails with:

Exception in thread "main" java.lang.Exception: Invalid size: 2 at TestTag.main(TestTag.java:42)

From my reading of the documentation only the equals operator needs to be implemented for Set:

What am I missing from the documentation ?

3

There are 3 best solutions below

4
On BEST ANSWER

I just tried to reproduce your issue, and maybe you just didn't override equals and/or hashSet correctly.

Take a look at my incorrect implemenation of Tag:

public class Tag {

private int x, y;

public Tag(int x, int y) {
    this.x = x;
    this.y = y;
}

public boolean equals(Tag tag) {
    if (x != tag.x) return false;
    return y == tag.y;
}

@Override
public int hashCode() {
    int result = x;
    result = 31 * result + y;
    return result;
}
}

Looks quite ok doesn't it? But the problem is, I actually do not override the correct equals method, I overloaded it with my own implementation.

To work correctly, equals has to look like this:

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    Tag tag = (Tag) o;

    if (x != tag.x) return false;
    return y == tag.y;
}
3
On

What am I missing from the documentation ?

You are looking at the wrong part of the documentation.

The C++ set is an "sorted set of unique objects", and are "usually implemented as red-black trees."

In Java, Set is a more abstract concept (it's an interface, not a class) with multiple implementations, most notably the HashSet and the TreeSet (ignoring concurrent implementations).

As you can probably guess from the name alone, the Java TreeSet is the equivalent of the C++ set.

As for requirements, HashSet uses the hashCode() and equals() methods. They are defined on the Object class, and needs to be overridden on classes that needs to be in a HashSet or as keys in a HashMap.

For TreeSet and keys of TreeMap, you have two options: Provide a Comparator when creating the TreeSet (similar to C++), or have the objects implement the Comparable interface.

0
On

I guess this was simply a combination of bad luck and misunderstanding of HashSet requirement. Thanks to @christophe for help, I realized the issue when I tried adding in my swig generated Tag.java class:

@Override
public boolean equals(Object o) {
}

I got the following error message:

gdcm/Tag.java:78: error: method does not override or implement a method from a supertype
  @Override
  ^
1 error
1 warning

Which meant my error was simply:

  • I had the wrong signature in the first place: boolean equals(Object o) != boolean equals(Tag t)

The hint was simply to use the @Override keyword.


For those asking for the upstream code, the Java code is generated by swig. The original c++ code is here: