I was wondering how this could be achieved in the most efficient way.
Should I use
a.RemoveAll(x => b.AsParallel().Any(y => y == x));
or
a.AsParallel().Except(b.AsParallel());
or something else?
Can anyone explain what the underlying difference is? It seems to me, from measuring, that the second line is slower. What is the reason for this?
Using the second option, with two ParallelQuery<T> operations, will perform the entire operation in parallel.
The first option does a sequential check for the removal, and must build the ParallelQuery<T> for each iteration, which will likely be far slower.
Depending on the number of elements, however, it may actually be faster to run this without AsParallel, i.e. as a plain a.Except(b). In many cases, the overhead of parallelizing for smaller collections outweighs the gains. The only way to know, in this case, would be to profile and measure the options involved.
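For reference, here is a minimal sketch of the options side by side. The sample data and the int element type are assumptions, since the question doesn't say what a and b contain. The HashSet<T> variant is not from the question; it is one possible "something else" if you need to modify the list in place. Note that Except is a set operation, so it drops duplicates, and the parallel version does not guarantee ordering unless you add AsOrdered().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical sample data; the question does not specify types or sizes.
        var a = new List<int> { 1, 1, 2, 3, 4, 5 };
        var b = new List<int> { 2, 4 };

        // Sequential set difference: distinct elements of a that are not in b.
        List<int> sequential = a.Except(b).ToList();

        // Parallel set difference: same elements, but the order is not guaranteed.
        List<int> parallel = a.AsParallel().Except(b.AsParallel()).ToList();

        // In-place removal with a hash lookup instead of scanning b per element.
        // Unlike Except, this keeps duplicates in a that are not in b.
        var lookup = new HashSet<int>(b);
        a.RemoveAll(lookup.Contains);

        Console.WriteLine("Except:          " + string.Join(", ", sequential));
        Console.WriteLine("Parallel Except: " + string.Join(", ", parallel.OrderBy(x => x)));
        Console.WriteLine("RemoveAll:       " + string.Join(", ", a));
    }
}
```

Keep in mind that the Except variants return a new sequence and leave a untouched, while RemoveAll mutates a, so they are not drop-in replacements for each other.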
The slower timing you measured may be due to a number of factors. First, make sure you're running a release build outside of the Visual Studio host process (this is a common issue). Otherwise, it may be due to the size of the collections and the data types involved.
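If you want to measure this yourself, something along these lines is a rough starting point. The collection sizes, the Time helper, and the single warm-up pass are all assumptions for illustration; a dedicated benchmarking library would give more reliable numbers than a bare Stopwatch.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Benchmark
{
    // Hypothetical helper: runs the action once and returns the elapsed time.
    static TimeSpan Time(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        // Hypothetical sizes; adjust them to match your real data.
        List<int> a = Enumerable.Range(0, 1000000).ToList();
        List<int> b = Enumerable.Range(0, 1000000).Where(x => x % 3 == 0).ToList();

        // Warm-up pass so JIT compilation does not skew the first measurement.
        a.Except(b).Count();
        a.AsParallel().Except(b.AsParallel()).Count();

        // Count() forces the lazy queries to run to completion.
        Console.WriteLine("Sequential: " + Time(() => a.Except(b).Count()));
        Console.WriteLine("Parallel:   " + Time(() => a.AsParallel().Except(b.AsParallel()).Count()));
    }
}
```

Run it as a release build without the debugger attached, and try several collection sizes, since the point at which the parallel version starts to win (if it does at all) depends on your data.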