Why is topological sort needed for Longest Path in Directed Acyclic Graph?

11.2k Views Asked by At

Problem: Given a Weighted Directed Acyclic Graph (DAG) and a source vertex s in it, find the longest distances from s to all other vertices in the given graph.

Please find the reference graph: link

Why do we need topological sorting? Can we not simply use modified BFS from source vertex. Why do we care so much about the linear ordering.

If this is a repetition then kindly redirect me to relevant answers.

Thanks

4

There are 4 best solutions below

0
On BEST ANSWER

If we don't sort it, we don't know which adjacent vertex to choose first and it may lead to a situation where we use distance of a vertex v to update distances of its adjacent vertices adj[v], but after that, the distance of vertex v gets updated, so vertices from adj[v] could also get bigger distances, but we won't visit them anymore.

Example based on the graph you have referenced (http://www.geeksforgeeks.org/wp-content/uploads/LongestPath.png):
Let's say that at this step:
Step 1
Say, we start to traverse the graph from vertex '0', and we choose vertex with distance 6 (instead of vertex with distance 2, which we would have chosen if we had used topological order). Already processed vertices are green, vertex currently being processed is red:
Step 2
We have updated the distance of the last vertex to 7 and we won't increase it, however if we had visited vertex with distance 2 in previous step, the distance of this vertex would have been 10: Step 3

1
On

Topological sorting is not required if you greedily process nodes everytime an update happens. If you greedily revisit all neighbours which you can improve the longest distance. For example at 9, you can just check that 9+1 > 7 and revisit the 7th node to update it.

Last step of the BFS

0
On

If we can keep track of visited nodes, it should be possible to use a recursive DFS and some memoization.

Start from starting node. For each neighbor, calculate (distance to neighbor + distance from neighbor to goal). Take the max of those, memoize it as the max from this node, and return it.

Basically, if you know the max distance from your neighbors to the goal, you know the max distance from you to the goal. And if you memoize, you won't visit any node more than once.

0
On

If you know how to do it with "modified BFS", then you can do it that way. How do you propose to do it with "modified BFS", BTW?

Meanwhile, the algorithm presented at the link does it through first toposorting the graph. That's just how that algorithm is designed.

Now, toposort order is produced by DFS algirithm on its backtracking stage. Unfortunately, DFS generates toposort order in reverse direction. For this reason, we cannot "embed" the specific processing of Longest Path algorithm directly into DFS. (This algorithm requires processing in forward direction.) Hence we have to adopt two-pass approach: do pure DFS first, to build a full toposorted sequence, and then make a second pass to find the longest path.

In many real-life situations algorithms based on toposort are valuable because vertices of the DAG might be already stored in toposorted order. I.e. toposorting is done only once at pre-processing stage. After that various toposort-based algorithms effectively turn into very efficient one-pass algorithms with no extra memory requirement (as opposed to such algorithms as BFS or DFS, which require extra memory for their stacks, queues etc.)