I came across this strange behaviour in C# vs Java. The HashSet in Java seems to compare the contents in an object and reject duplicates or gives out a boolean value for when it is checked for contains. The same isn't true in C#. Here are the code examples:
Java code:
HashSet<List<Integer>> setOfLists = new HashSet<List<Integer>>();
List<Integer> list1 = Arrays.asList(1, 2, 3);
List<Integer> list2 = Arrays.asList(1, 2, 3);
// Adding list1 and list2 to the HashSet
setOfLists.add(list1);
setOfLists.add(list2);
// Checking if list1 is in the HashSet
System.out.println(setOfLists.contains(list1)); // Output is true
// Checking if new Array with same content is in the HashSet
System.out.println(setOfLists.contains(Arrays.asList(1, 2, 3))); // Output is true
// Checking the length of the HashSet
System.out.println(setOfLists.size()); // Output is 1
C# Code:
HashSet<List<int>> setOfLists = new HashSet<List<int>>();
List<int> list1 = new List<int> { 1, 2, 3 };
List<int> list2 = new List<int> { 1, 2, 3 };
// Adding list1 and list2 to the HashSet
setOfLists.Add(list1);
setOfLists.Add(list2);
// Checking if list1 is in the HashSet
Console.WriteLine(setOfLists.Contains(list1)); // Output is true
// Checking if new Array with same content is in the HashSet
Console.WriteLine(setOfLists.Contains(new List<int> { 1, 2, 3 })); // Output is false
// Checking the length of the HashSet
Console.WriteLine(setOfLists.Count); // Output is 2
I don't understand why this difference. As we can see in the Print statements, C# is unable to find the duplicates by the contents in case of objects. HashSets are supposed to be used to eliminate duplicates and do so in O(1) time.
The biggest difference I see from the previous answers are all the solutions suggested have time complexity of O(N) where N is the number of elements (lists) in the HashSet. However, in Java, this seems to happen without having to explicitly modifying the comparers, and with O(1) complexity for at least fairly smaller Lists.
Is there any alternate options to achieve that in C# to achieve this in O(1) time complexity?
The problem is that the
List<T>class in C# does not overwrite theEquals()orGetHashCode()method, see Check if two lists are equal. This means that the two listsare not equal (according to
Equals()), see the following example code:This could generate the following output:
Therefore, the
HashSet<List<int>>instance contain two list references, which are (still) not equal, as seen in your code:The solution is to provide a custom IEqualityComparer implementation the
HashSetinstance should use. It could look like this:With this implementation you get the expected result as in java:
This will generate the following output: