I have a hierarchy of nodes that I need to use for an analysis. Sort of like this
I'm trying to find an algorithm that will allow me to find the nearest common ancestors between two nodes. I know there are algorithms that find the lowest common ancestor, but I haven't been able to find one that allows us to find the nearest few.
For example, in the picture I linked above, if I give it two nodes: 0 and 1, it should return 2 and 5. i.e. It should return all common ancestors of the nodes that don't have descendants that are also common ancestors. A naive approach would be to get all the common ancestors of 0 and 1: {7, 5, 6, 3, 2}, and then eliminate 7 since it has descendants in the set. Then it'll eliminate 6 and 3 as well. So it would return SLCA = {5,2}. At the moment, I've stored all the ancestors of each node in a list. So this naive approach is possible. However, I'd like do a more efficient method that will be scalable for even very large graphs. Ultimately, for a given graph, I'd like to be able to compute the pairwise SLCA of each pair of nodes. I think this brute force approach would be slow for very large graphs.
Does anyone know of such an algorithm? I've been reading these papers, but they are pretty dense, and I've been stuck trying to understand them.
https://www.dcs.warwick.ac.uk/~czumaj/PUBLICATIONS/DRAFTS/LCA-Max-witness.pdf
http://www.ccs.neu.edu/home/dherman/browse/projects/mini-javac/papers/bender01finding.pdf
https://algo2.iti.kit.edu/download/fischer10new.pdf
https://algo2.iti.kit.edu/download/fischer10new.pdf
I appreciate the help
Finding all of the common ancestors, you can do this with two BFS and an array that keep the node which are in both trees: Time -
O(V + E)
, Memory:O(V)
.Now you can find all of the ancestors which do not have descendants in the group, which are all the nodes that are the closest to the roots. This will not take long, take the sub graph of this group and find a node with no incoming edges in
O(V + E)
time (By iterating over the nodes), andO(V)
memory. This will be your answer.So in total - Time
O(V + E)
, MemoryO(V)
.Duplicate the graph, now we have
G1
andG2
. For each node inG1
create a new edge over to the corresponding node inG2
. 'Flip' all the edges inG2
- if there was an edge fromv
tou
now it's fromu
tov
.Call this new graph
G
, call the two nodes you need to find the SLCAu
andv
.It is easy to show that the shortest route from
u
inG1
to the correspondingv
inG2
will pass through an edge fromG1
toG2
and that the node of this edge (both nodes on this edge are corresponding) will be in the SLCA group.That is since if you look at the original graph, the paths we took from each node
v
,u
is the shortest to meet, which is the definition of the SLCA group.Now you need to find all the path of the shortest path length and extract all of these nodes.
Time:
O(E + V)
(shortest path - BFS)Memory:
O(V)
(BFS)