Java 7 changed the sorting algorithm such that it throws an
java.lang.IllegalArgumentException: "Comparison method violates its general contract!"
in some cases when the used comparator is buggy. Is it possible to tell what kind of bug in the comparator causes this? In my experiments it did not matter if x != x , it also did not matter if x < y and y < z but z < x , but it did matter if x = y and y = z but x < z for some values x, y, z. Is this generally so?
(If there were a general rule to this, it might be easier to look for the bug in the comparator. But of course it is better to fix all bugs. :-) )
In particular, the following two comparators did not make TimSort complain:
final Random rnd = new Random(52);
Comparator<Integer> brokenButNoProblem1 = new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
if (o1 < o2) {
return Compare.LESSER;
} else if (o1 > o2) {
return Compare.GREATER;
}
return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
}
};
Comparator<Integer> brokenButNoProblem2 = new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
if (o1 == o2) {
return Compare.EQUAL;
}
return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
}
};
but the following comparator did make it throw up:
Comparator<Integer> brokenAndThrowsUp = new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
if (Math.abs(o1 - o2) < 10) {
return Compare.EQUAL; // WRONG and does matter
}
return Ordering.natural().compare(o1, o2);
}
};
UPDATE: in some real life data we had a failure where there were no x,y,z with x = y and y = z but x < z . So It seems my guess was wrong, and it doesn't seem this specific kind failure only. Any better ideas?
After looking at the code of
ComparableTimSort
I am not quite sure. Let's analyze it. Here is the only method that throws it (there is a similar method that does the same only with exchanged roles, so analyzing one of them is enough).The method performs a merging of two sorted runs. It does a usual merge but starts "gallopping" once it encounters that one side starts "winning" (I.e., being always less than the other) all the time. Gallopping tries to make things faster by looking ahead more elements instead of comparing one element at a time. Since the runs should be sorted, looking ahead is fine.
You see that the exception is only throw when
len1
is0
at the end. The first observation is the following: During the usual merge, the exception can never be thrown since the loop aborts directly oncelen
this1
. Thus, the exception can only be thrown as result of a gallop.This already gives a strong hint that the exception behaviour is unreliable: As long as you have small data sets (so small that a generated run may never gallop, as
MIN_GALLOP
is7
) or the generated runs always coincidentally generate a merge that never gallops, you will never receive the exception. Thus, without further reviewing thegallopRight
method, we can come to the conclusion that you cannot rely on the exception: It may never be thrown no matter how wrong your comparator is.