Foreach loop takes a very long time to break out of

I'm scraping a webpage that contains about 250 table cells (td tags), using WatiN and WatinCSSSelectors.

First I select all td tags with attribute 'width=90%':

var allMainTDs = browser.CssSelectAll("td[width=\"90%\"]");

Then I run a foreach loop that adds each element to a List. The int counter is there to show which td tag the loop is currently on.

List<Element> eletd = new List<Element>();
int i = 0;
foreach (Element td in allMainTDs)
{
    eletd.Add(td);
    i++;
    Console.WriteLine(i);                    
}

It reaches the 250th tag fairly quickly, but then it takes about 6 minutes (timed with a Stopwatch object) to move on to the next statement. What is happening here?

There are 3 best solutions below

3
On BEST ANSWER

A foreach loop is roughly equivalent to the following code (not exactly, but close enough):

IEnumerator<T> enumerator = enumerable.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        T element = enumerator.Current;
        // here goes the body of the loop
    }
}
finally
{
    IDisposable disposable = enumerator as System.IDisposable;
    if (disposable != null) disposable.Dispose();
}

The behavior you describe points to the cleanup portion of this code. It's possible that the enumerator for the result of the CssSelectAll call has an expensive Dispose method. You could confirm this by replacing your loop with something like the code above and omitting the finally block, or by setting breakpoints to check whether Dispose is what takes so long to run.
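As a rough way to check this without a debugger, the expanded loop can be timed in two parts: the enumeration itself and the Dispose call. The sketch below is only an illustration under assumptions: SlowDisposeSequence is a hypothetical stand-in for the CssSelectAll result (its Dispose just sleeps), and EnumerationTimer.Measure is a helper name invented here, not part of WatiN. In the real code you would pass allMainTDs to Measure instead.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for the CssSelectAll result: it enumerates quickly,
// but its enumerator has a deliberately slow Dispose.
public sealed class SlowDisposeSequence : IEnumerable<int>
{
    private readonly int _count;
    private readonly int _disposeDelayMs;

    public SlowDisposeSequence(int count, int disposeDelayMs)
    {
        _count = count;
        _disposeDelayMs = disposeDelayMs;
    }

    public IEnumerator<int> GetEnumerator() => new Enumerator(_count, _disposeDelayMs);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private sealed class Enumerator : IEnumerator<int>
    {
        private readonly int _count;
        private readonly int _disposeDelayMs;
        private int _index = -1;

        public Enumerator(int count, int disposeDelayMs)
        {
            _count = count;
            _disposeDelayMs = disposeDelayMs;
        }

        public int Current => _index;
        object IEnumerator.Current => Current;
        public bool MoveNext() => ++_index < _count;
        public void Reset() => _index = -1;

        // Simulates heavy cleanup work.
        public void Dispose() => Thread.Sleep(_disposeDelayMs);
    }
}

public static class EnumerationTimer
{
    // Runs the expanded form of foreach and times the loop and Dispose separately.
    public static (List<T> Items, TimeSpan LoopTime, TimeSpan DisposeTime) Measure<T>(IEnumerable<T> source)
    {
        var items = new List<T>();
        var loopWatch = new Stopwatch();
        var disposeWatch = new Stopwatch();

        IEnumerator<T> enumerator = source.GetEnumerator();
        try
        {
            loopWatch.Start();
            while (enumerator.MoveNext())
            {
                items.Add(enumerator.Current);
            }
            loopWatch.Stop();
        }
        finally
        {
            disposeWatch.Start();
            enumerator.Dispose();
            disposeWatch.Stop();
        }

        return (items, loopWatch.Elapsed, disposeWatch.Elapsed);
    }
}

public static class Program
{
    public static void Main()
    {
        var result = EnumerationTimer.Measure(new SlowDisposeSequence(250, 500));
        Console.WriteLine("Items:   " + result.Items.Count);
        Console.WriteLine("Loop:    " + result.LoopTime.TotalMilliseconds + " ms");
        Console.WriteLine("Dispose: " + result.DisposeTime.TotalMilliseconds + " ms");
    }
}
```

If the Dispose time dominates when you run this against the real collection, the delay is in the enumerator's cleanup rather than in the loop body.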

2
On

You could try this:

var eletd = new List<Element>(allMainTDs);

2
On

If you are on .NET 4.0 and your execution environment allows for parallelism, you could try:

  Parallel.ForEach(..);
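
A minimal sketch of what that could look like, under assumptions: List<T>.Add is not thread-safe, so the items are collected into a ConcurrentBag<T> instead, and the result order is not guaranteed. ParallelCollect.Collect is a helper name made up for this example, and plain integers stand in for the td elements. Whether WatiN elements can safely be touched from several threads is not established anywhere in this thread, so treat this as an experiment rather than a fix. Parallel.ForEach still has to enumerate the source, so it may not help if the delay is in the enumerator's cleanup.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelCollect
{
    // Copies every item of source into a list using Parallel.ForEach.
    // ConcurrentBag<T> is used because List<T>.Add is not safe to call
    // from multiple threads at once. The resulting order is unspecified.
    public static List<T> Collect<T>(IEnumerable<T> source)
    {
        var bag = new ConcurrentBag<T>();
        Parallel.ForEach(source, item => bag.Add(item));
        return bag.ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Plain integers stand in for the td elements here.
        List<int> copied = ParallelCollect.Collect(Enumerable.Range(0, 250));
        Console.WriteLine("Collected " + copied.Count + " items");
    }
}
```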