I compare task data from Microsoft Project using a nested for
loop, but because the project has many records (more than 1000), it is very slow.
How can I improve the performance?
    for (int n = 1; n < thisProject.Tasks.Count; n++)
    {
        string abc = thisProject.Tasks[n].Name;
        string def = thisProject.Tasks[n].ResourceNames;
        for (int l = thisProject.Tasks.Count; l > n; l--)
        {
            // MessageBox.Show(thisProject.Tasks[l].Name);
            if (abc == thisProject.Tasks[l].Name && def == thisProject.Tasks[l].ResourceNames)
            {
                thisProject.Tasks[l].Delete();
            }
        }
    }
As you can see, I compare the Name and ResourceNames of each Task, and when I find a duplicate, I call Task.Delete to get rid of it.
A hash check should be a lot faster than nested looping in this case, i.e. roughly O(n) comparisons vs. O(n^2).
First, provide an equality comparer of your own.
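A minimal sketch of what such a comparer might look like. TaskInfo is a hypothetical stand-in that carries only the two compared properties; against the real interop library you would presumably implement IEqualityComparer over the MS Project task type instead. The name TaskComparer is also an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an MS Project task (assumption, not the interop type).
public sealed class TaskInfo
{
    public string Name { get; set; }
    public string ResourceNames { get; set; }
}

// Treats two tasks as equal when both Name and ResourceNames match.
public sealed class TaskComparer : IEqualityComparer<TaskInfo>
{
    public bool Equals(TaskInfo x, TaskInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name && x.ResourceNames == y.ResourceNames;
    }

    public int GetHashCode(TaskInfo obj)
    {
        if (obj is null) return 0;
        unchecked // let the arithmetic wrap around instead of throwing on overflow
        {
            int hash = 17;
            hash = hash * 31 + (obj.Name?.GetHashCode() ?? 0);
            hash = hash * 31 + (obj.ResourceNames?.GetHashCode() ?? 0);
            return hash;
        }
    }
}
```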
Don't worry too much about the GetHashCode implementation; it is just boilerplate code that combines the hash codes of the compared properties into one. The result does not have to be unique, it only has to be identical for tasks that compare as equal.

Now that you have this class for comparison and hashing, you can use it together with a HashSet to remove your dupes.
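The removal step could be sketched roughly as follows. This is an illustration over a plain IList<T>, not the MS Project object model: the helper name RemoveDuplicates and the class Dedupe are hypothetical, and RemoveAt stands in for calling Delete on the task. Any IEqualityComparer<T> can be passed in, including one that compares Name and ResourceNames.

```csharp
using System;
using System.Collections.Generic;

public static class Dedupe
{
    // Removes every later duplicate in place, keeping the first occurrence.
    // Returns the number of removed items.
    public static int RemoveDuplicates<T>(IList<T> items, IEqualityComparer<T> comparer)
    {
        var set = new HashSet<T>(comparer);
        int removed = 0;
        int i = 0;
        while (i < items.Count)
        {
            if (!set.Add(items[i]))
            {
                // Add returned false: an equal item was already seen.
                // Assumption: with MS Project this would be a Delete() call on the task.
                items.RemoveAt(i);
                removed++;
                // Do not advance i: the next item has shifted into slot i.
            }
            else
            {
                i++; // first occurrence, keep it
            }
        }
        return removed;
    }
}
```

For example, calling Dedupe.RemoveDuplicates on a List<string> containing A, B, A, C, B with StringComparer.Ordinal leaves A, B, C and returns 2. Each item costs one hash lookup rather than a scan over all remaining items, which is where the saving over the nested loop comes from.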
The idea is to scan all your elements once while storing them in a HashSet. The HashSet checks, based on our equality comparer, whether the provided element is a duplicate: Add returns false when an equal element is already in the set. Since you want to delete the dupes, each detected dupe is deleted.

You can modify this approach to extract the unique items instead of deleting the dupes by reversing the condition to if (set.Add(thisProject.Tasks[i])) and processing the task inside that if.
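The reversed condition might look like this in a small helper. Again this is only a sketch over a generic sequence; UniqueFilter and UniqueItems are hypothetical names, and the source collection is left untouched.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueFilter
{
    // Returns the first occurrence of each distinct item, in original order.
    public static List<T> UniqueItems<T>(IEnumerable<T> items, IEqualityComparer<T> comparer)
    {
        var set = new HashSet<T>(comparer);
        var unique = new List<T>();
        foreach (var item in items)
        {
            if (set.Add(item)) // true only the first time an equal item is seen
            {
                unique.Add(item);
            }
        }
        return unique;
    }
}
```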