I have a datatable that has a traverse tree structure on it, it has lots of columns but the ones that matter are as follows: |RepID|LeaderID|Depth|
I want to set the tree's depth as: RepID that has no LeaderID (null) should be depth 0 RepID that has as a leader the ones that have depth 0 should be depth 1 and so on
Up to now I made a recursive algorithm that is getting the job done, however since I'm iterating over 400,000 rows it's taking a lot of time.
Up to now my code is as follows:
public static DataTable SetTreeLevel(DataTable dtReps)
{
//Getting the 1st level (reps without a leader)
var first_level = from r in dtReps.AsEnumerable()
where (r.Field<int?>("LeaderID") == null || r.Field<int?>("LeaderID") == 0)
select r;
//Setting the level for the reps obtained
foreach (var row in first_level)
{
row["Depth"] = 1;
}
//Setting the next levels
return setTreeLevelRecursive(dtReps, 2);
}
private static DataTable setTreeLevelRecursive(DataTable dtReps, int depth)
{
//Getting reps of the last level (depth -1)
var last_level = from r in dtReps.AsEnumerable()
where r.Field<int?>("Depth") == depth - 1
select r.Field<int?>("RepID");
//List to improve performance
List<int?> last_level_list = last_level.ToList<int?>();
//Getting the next level reps (leader is on the last level list)
var actual_level = from r in dtReps.AsEnumerable()
where last_level_list.Contains(r.Field<int?>("LeaderID"))
select r;
//List to improve performance
List<DataRow> actual_level_list = actual_level.ToList<DataRow>();
foreach (DataRow row in actual_level_list)
{
row["Depth"] = depth;
}
//Validating if there are reps without depth
if ((from r in dtReps.AsEnumerable()
where r.Field<int?>("Depth") == null
select r).Count() > 0)
{
//Asignando siguiente nivel
setTreeLevelRecursive(dtReps, depth + 1);
}
//Regresando resultado
return dtReps;
}
Edit: Using Servy's optimization I coded the following:
var lookup = dtReps.AsEnumerable().ToLookup(x => x.Field<int?>("LeaderID"));
//First level
var first_level = from r in dtReps.AsEnumerable()
where (r.Field<int?>("LeaderID") == null || r.Field<int?>("LeaderID") == 0)
select Tuple.Create(r.Field<int>("RepID"), 1);
var rows = Traverse(first_level, node => lookup[node.Item1]
.Select(row => Tuple.Create(row.Field<int>("RepID"), node.Item2 + 1))).ToList();
foreach (var r in rows)
{
(from r_nivel in dtReps.AsEnumerable()
where r_nivel.Field<int>("RepID") == r.Item1
select r_nivel).FirstOrDefault()["Depth"] = r.Item2;
}
But the foreach takes a lot of time Thanks!
First, you can define a Rep object:
Then you populate a List from your DataTable using:
Then you create a lookup table for each leader's reps by:
Now you use LeaderLookup to recursively set Depths. This is very fast, I am using similar data with around 3,000 items, and it can be populate under a second.
To do that, you define this function:
Then you invoke the function with:
after the ToDictionary statement. After that call completes you will have a List with depths all set correctly. You can iterate through that to save it to the Database.
How long does that take?