Linq count vs IList count

1.5k Views Asked by At

If I have the following IEnumerable list which comes from some repository.

IEnumerable<SomeObject> items = _someRepo.GetAll();

What is faster:

items.Count(); // Using Linq on the IEnumerable interface.

or

List<SomeObject> temp = items.ToList<SomeObject>(); // Cast as a List

temp.Count(); // Do a count on a list

Is the Linq Count() faster or slower than casting the IEnumerable to a List and then performing a Count()?

Update: Improved the question a little bit to a bit more realistic scenario.

3

There are 3 best solutions below

2
MarcinJuraszek On BEST ANSWER

Calling Count directly is a better choice.

Enumerable.Count has some performance improvements built in that will let it return without enumerating the entire collection:

public static int Count<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    ICollection<TSource> collectionoft = source as ICollection<TSource>;
    if (collectionoft != null) return collectionoft.Count;
    ICollection collection = source as ICollection;
    if (collection != null) return collection.Count;
    int count = 0;
    using (IEnumerator<TSource> e = source.GetEnumerator()) {
        checked {
            while (e.MoveNext()) count++;
        }
    }
    return count;
}

ToList() uses similar optimizations, baked into List<T>(IEnumerable<T> source) constructor:

public List(IEnumerable<T> collection) {
    if (collection==null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
    Contract.EndContractBlock();

    ICollection<T> c = collection as ICollection<T>;
    if( c != null) {
        int count = c.Count;
        if (count == 0)
        {
            _items = _emptyArray;
        }
        else {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }    
    else {                
        _size = 0;
        _items = _emptyArray;
        // This enumerable could be empty.  Let Add allocate a new array, if needed.
        // Note it will also go to _defaultCapacity first, not 1, then 2, etc.

        using(IEnumerator<T> en = collection.GetEnumerator()) {
            while(en.MoveNext()) {
                Add(en.Current);                                    
            }
        }
    }
}

But as you can see it only uses generic ICollection<T>, so if your collection implements ICollection but not its generic version calling Count() directly will be much faster.

Not calling ToList first also saves you an allocation of new List<T> instance - not something overly expensive, but it's always better to avoid unnecessary allocation when possible.

4
Eric J. On

Either version requires (in the general case) that you completely iterate your IEnumerable<string>.

In some cases, the backing type provides a mechanism to directly determine the count which can be used for O(1) performance. See @Marcin's answer for details.

The version where you call ToList() will have an additional CPU overhead, though very small and possibly hard to measure. It will also allocate memory that would not otherwise be allocated. If your count is high, that would be the greater concern.

0
David L On

A very rudimentary LinqPad test indicates that calling IEnumerable<string>.Count() is faster than creating a list collection and getting the count, not to mention more memory efficient (as mentioned in other answers) and faster still when revisited the already enumerated collection.

I had an average of ~4 ticks for calling Count() off of the IEnumerable vs ~10k for creating a new list to obtain Count.

void Main()
{
    IEnumerable<string> ienumerable = GetStrings();
    var test1 = new Stopwatch();
    test1.Start();
    var count1 = ienumerable.Count();
    test1.Stop();
    test1.ElapsedTicks.Dump();

    var test2 = new Stopwatch();
    test2.Start();
    var count2 = ienumerable.ToList().Count;
    test2.Stop();
    test2.ElapsedTicks.Dump();

    var test3 = new Stopwatch();
    test3.Start();
    var count3 = ienumerable.Count();
    test3.Stop();
    test3.ElapsedTicks.Dump();
}

public IEnumerable<string> GetStrings()
{
    var testString = "test";
    var strings = new List<string>();
    for (int i = 0; i < 500000; i++)
    {
        strings.Add(testString);
    }

    return strings;
}

In the latter case, you're incurring the cycles required to create a new collection from an existing collection (which under the hood has to iterate the collection), then pull the Count property off of the collection. As a result, the Enumerable optimizations win and return the count value faster.

In the third test run, average ticks dropped to ~2 since it immediately returned the previously seen count (as highlighted below).

IColllection<TSource> collectionoft = source as ICollection<TSource>;
if (collectionoft != null) return collectionoft.Count;
ICollection collection = source as ICollection;
if (collection != null) return collection.Count;

However, the real cost here is not CPU cycles, but rather memory consumption. That's what you should be more concerned about.

Finally, as a warning, be careful to not use Count() while IN the enumeration of the collection. Doing so will re-enumerate the collection, leading to possible collisions. If you need to use count for something while iterating the collection, the proper approach is to create a new list with .ToList() and iterate that list, referencing Count