Consider the following simple code with LINQ OrderBy and ThenBy:
static void Main()
{
var arr1 = new[] { "Alpha", "Bravo", "Charlie", };
var coStr = Comparer<string>.Create((x, y) =>
{
Console.WriteLine($"Strings: {x} versus {y}");
return string.CompareOrdinal(x, y);
});
arr1.OrderBy(x => x, coStr).ToList();
Console.WriteLine("--");
var arr2 = new[]
{
new { P = "Alpha", Q = 7, },
new { P = "Bravo", Q = 9, },
new { P = "Charlie", Q = 13, },
};
var coInt = Comparer<int>.Create((x, y) =>
{
Console.WriteLine($"Ints: {x} versus {y}");
return x.CompareTo(y);
});
arr2.OrderBy(x => x.P, coStr).ThenBy(x => x.Q, coInt).ToList();
}
This simply uses some comparers that write out to the console what they compare.
On my hardware and version of the Framework (.NET 4.6.2), this is the output:
Strings: Bravo versus Alpha Strings: Bravo versus Bravo Strings: Bravo versus Charlie Strings: Bravo versus Bravo -- Strings: Bravo versus Alpha Strings: Bravo versus Bravo Ints: 9 versus 9 Strings: Bravo versus Charlie Strings: Bravo versus Bravo Ints: 9 versus 9
My question is: Why would they compare an item from the query to itself?
In the first case, before the -- separator, they do four comparisons. Two of them compare an entry to itself ("Strings: Bravo versus Bravo"). Why?
In the second case, there should not ever be a need for resorting to comparing the Q properties (integers); for there are no duplicates (wrt. ordinal comparison) in the P values, so no tie-breaking from ThenBy should be needed ever. Still we see "Ints: 9 versus 9" twice. Why use the ThenBy comparer with identical arguments?
Note: Any comparer has to return 0 upon comparing something to itself. So unless the algorithm just wants to check if we implemented a comparer correctly (which it will never be able to do fully anyway), what is going on?
Be aware: There are no duplicates in the elements yielded by the queries in my examples.
I saw the same issue with another example with more entries yielded from the query. Above I just give a small example. This happens with an even number of elements yielded, as well.
Ok, let's see the possibilities here:
Tis a value typeIn order to check if it's comparing an item against itself, it first needs to check if both items are the same one. How would you do that?
You could call
Equalsfirst and thenCompareToif the items are not the same. Do you really want to do that? The cost ofEqualsis going to be roughly the same as comparing so you'd actually be doubling the cost of the ordering for exactly what?OrderBysimply compares all items, period.Tis a reference typec# doesn't let you overload only with generic constraints so you'd need to check in runtime if
Tis a reference type or not and then call a specific implementation that would change the behavior described above. Do you want to incurr in that cost in every case? Of course not.If the comparison is expensive, then implement in your comparison logic a reference optimization to avoid incurring in stupid costs when comparing an item to itself, but that choice must be yours. I'm pretty sure
string.CompareTodoes precisely that.I hope this makes my answer clearer, sorry for the previous short answer, my reasoning wasnt that obvious.