We've had a little introspection over at Code Review of LINQ Enumerable and discovered different strategies for iterating a IEnumerable<TSource>
to yield us some information about it.
Any
public static bool Any<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
foreach (TSource element in source) {
if (predicate(element)) return true;
}
return false;
}
public static bool Any<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return true;
}
return false;
}
Count
public static int Count<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
ICollection<TSource> collectionoft = source as ICollection<TSource>;
if (collectionoft != null) return collectionoft.Count;
ICollection collection = source as ICollection;
if (collection != null) return collection.Count;
int count = 0;
using (IEnumerator<TSource> e = source.GetEnumerator()) {
checked {
while (e.MoveNext()) count++;
}
}
return count;
}
SingleOrDefault
public static TSource SingleOrDefault<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
TSource result = default(TSource);
long count = 0;
foreach (TSource element in source) {
if (predicate(element)) {
result = element;
checked { count++; }
}
}
switch (count) {
case 0: return default(TSource);
case 1: return result;
}
throw Error.MoreThanOneMatch();
}
As you can see, 3 different strategies were used.
- Any: Iterate the enumerator, with early exit
- Count: Iterate the enumerator, unless ICollection, call Count
- SingleOrDefault: Iterate the enumerator, without early exit
Some observations:
- 'Any()' could have also been optimized for ICollection
- 'SingleOrDefault' could have also been optimized to exit early
- none of the methods care about IReadOnlyCollection, even though it also has a property 'Count'
Questions:
- what are the pro's and contra's of each of these strategies?
- is there a good reason why the implementation of LINQ is as it is?
Well, the first two methods are self-explanatory, aren't they? They are optimized in a way that they stop as soon as possible and also check if the type has a
Count
property to avoid the loop.But the
SingleOrDefault
with predicate has indeed a strange implementation. It could stop at the 2nd matching item since then is clear that aInvalidOperationException
must be thrown(Single...
as opposed toFirst...
ensures that there is at maximum 1 item). But instead it checks every item and counts the matches. The version without predicate has this optimization.So the question is: What the heck? And it seems that is really a bug that won't be fixed since it's just reduces performance in error-case. I can't believe it.
By the way, the same bug exists in
Enumerable.Single
.