Why does this dfs code never terminate when the adjacency list is stored using a deafultdict?

48 Views Asked by At

I've recently started learning graphs and tried to implement it using Python. Creating a graph's adjacency list using a defaultdict, adding edges, that bit seemed to work fine. Now, when I tried to implement DFS code, the program just won't terminate.
Here's the program I wrote:

from collections import defaultdict
class Graph:
    def __init__(self):
        self.graph = defaultdict(list)

    def addEdge(self, u, v):
        self.graph[u].append(v)
        self.graph[v].append(u)

# Taking input from the user
g = Graph()
root = input("Enter the root node: ")
for _ in range(int(input("Enter number of edges: "))):
    u,v = input().split()
    g.addEdge(u,v)
adj_list = dict(g.graph)

# DFS traversal
path = set()
stack = []

stack.append(root)

while stack:
    u = stack.pop()

    if u not in path:
        path.add(u)
    
    if u not in adj_list: # for leaf nodes
        continue

    stack.extend(adj_list[u])

print(path)

And here's the output:

Enter the root node: A
Enter number of edges: 10
A B
A C
A D
B E
C G
C F
D H
E I
F J
G K


And then it just gets stuck. It won't take any input, won't display anything, all I can do is ctrl+C(Keyboard interrupt) in order to stop it.

When I hardcode the adjacency list like this:

adj_list = {"A":["B","C","D"],  
   "B":["E"],  
   "C":["G","F"],  
   "D":["H"],  
   "E":["I"],  
   "F":["J"],  
   "G":["K"]}  

Here's the output:

{'I', 'C', 'G', 'E', 'A', 'B', 'D', 'H', 'J', 'K', 'F'}

I tried hardcoding the adjacency list and not taking input from the user, then it seems to work fine. But I can't get my head around why it won't work if I take input from the user. I can't always hardcode the graph, there has to be a solution. Please help me, I can't calm down unless I understand what is wrong with this code.

1

There are 1 best solutions below

0
trincot On

The problem is that in the main loop your code adds the neighbors of a node on the stack, even if that node was already on the path.

Imagine a simple graph with 2 connected nodes: "a" and "b". As your graph is made undirected (see your addEdge), the adjacency list will be: { "a": ["b"], "b": ["a"] }

This means that whichever node is popped from the stack, it will always lead to an extension of the stack with one more node. And so the loop will visit "a", "b", "a", "b", "a", ...etc.

It worked for your hard-coded adjacency list, because it isn't an undirected graph, but a directed tree -- it has no cycles, and no edges that can be traversed in both directions. You wouldn't be able to create this graph with the Graph class you have.

A traversal should not visit the same node more than once, so when you detect that a node is already in the current path, you should not put its neighbors on the stack.

Change this:

while stack:
    u = stack.pop()

    if u not in path:
        path.add(u)
    
    if u not in adj_list: # for leaf nodes
        continue

    stack.extend(adj_list[u])

...to this:

while stack:
    u = stack.pop()

    if u not in path:  # when not yet visited
        path.add(u)  # visit the node
        stack.extend(adj_list[u])  # only here stack the neighbors for inspection

Remarks

  • You wouldn't detect leaf nodes with the if you had for that purpose (unless the graph was a trivial single node), because there is always the edge you came from.

  • The traversal will only visit nodes that can be reached from the root node. This may be the intention, but just realise that if the input graph is disconnected, this algorithm will not visit all nodes in the graph.