I've been asked to query a time logging database, to display all the work done for given projects. Each project is broken into tasks, each of which may itself be broken into tasks. The task hierarchy can be an arbitrary number of levels deep. Part of the requirement is to provide total time worked for each task or node in the hierarchy (not just the leaf level nodes but all nodes, including the top level project node, the leaf level nodes and all nodes in between).
Working with such a hierarchy I assume it may be useful to use the HIERARCHYID data type. Is there any way of doing something like a SUM with ROLLUP on a hierarchy, to give the sub-totals for each node in the hierarchy?
I assumed this sort of aggregate rollup on a hierarchy would be a common requirement but I've had no luck at all finding how to do it, or even if it is possible.
Figured out how to do it. The method is a bit convoluted, perhaps someone else can come up with a neater version.
The method involves four steps:
Run the ROW_NUMBER function over all the tasks for the given projects. Partition by ParentId so that all the child tasks of a given parent are numbered 1, 2, 3, 4, etc. This works at all levels of the task hierarchy;
Use a recursive CTE (common table expression) to walk up the task hierarchy from the leaf level to the top. This will build up the structure of the task hierarchy from the parent-child relationship within the TimeCode table. Originally I tried to include the ROW_NUMBER function here but that didn't work due to the way Microsoft has implemented CTEs;
Add a HIERARCHYID column to the structure built up in step 2;
Do a self-join on the recordset to get all the children of each node in the structure. Group by the parent node and sum the times recorded for each child node. Note that the HIERARCHYID method IsDescendantOf returns not only the children of a node but the node itself as well. So if any time has been recorded against the parent task as well as the children it will be included in the total time for that parent node.
Here's the script:
Results:
First Recordset (task structure with HIERARCHYID column):
Second Recordset (tasks with sub-totals for each task):