Is there a way to find semi-connected( unilaterally connected ) components in a directed graph with networkX?

332 Views Asked by At

I don't know if this is a weird question. Like in the title, is there a way to find semi-connected( unilaterally connected ) components in a directed graph with networkX? I do read the document of networkX but it just has is_semiconnected to check if a di-graph is connected or not, i don't see any other option( am I missing anything?). What I want is something like nx.connected_components. For example I have a graph like this :enter image description here

I want the result to be : [A,B,C], [A,E,F], [X] and [Y]. The output data should be in the order, because in my case, A->B mean A happen before B,B->C mean B happen before C, so I want to keep the order as it is in the graph. Is there a way to do this? If a node is not connected to any other node, it should be considered semi-connected too( for my case that i'm working ). Like in the example, if I have another standalone node called X, then X should be in the final answer too. I know this is kinda weird for directed graph, but need it for my data and output. It's like finding maximal semi-connected components.

Also i can make sure that my graph won't have cycles in it.

2

There are 2 best solutions below

4
Stef On

Here is a bruteforce function which will yield all maximally semiconnected subgraphs.

This uses a function maximal_subsets(seq, pred) to yield all maximal subsets of a sequence seq for a predicate pred.

You can find an explanation for that function in an answer to this related question:

from itertools import chain, combinations
from collections import defaultdict
from networkx import is_semiconnected

def powerset_decreasing(iterable):
    "powerset_decreasing([1,2,3]) --> (1,2,3) (1,2) (1,3) (2,3) (1,) (2,) (3,)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s), 0, -1))

def maximal_subsets(seq, pred):
    cache = defaultdict(set)
    yielded_something = False
    for subset in powerset_decreasing(seq):
        supersets = set.intersection(*(cache[x] for x in subset))
        if (not supersets) and pred(subset):
            yield subset
            yielded_something = True
            for x in subset:
                cache[x].add(subset)
    if (not yielded_something) and pred(()):
        yield ()

def maximal_semiconnected_subgraphs(G):
    def pred(node_subset):
        return is_semiconnected(G.subgraph(node_subset))
    yield from maximal_subsets(G.nodes, pred)

Testing:

## A -> B \    /> F -> G
##         > E 
## C -> D /    \> H -> I
from networkx import DiGraph
G = DiGraph()
G.add_edges_from('AB BE CD DE EF FG EH HI'.split())

components = [''.join(h) for h in maximal_semiconnected_subgraphs(G)]

print("COMPONENTS: ", components)
# COMPONENTS:  ['ABEFG', 'ABEHI', 'ECDFG', 'ECDHI']
8
David Eisenstat On

Here’s an algorithm that makes effective use of NetworkX.

  1. Compute the strong components of G and contract them, forming a directed acyclic graph.
  2. Transitively reduce the DAG.
  3. Enumerate all maximal paths in the DAG. (The technical wording of this step makes the previous step redundant, but after the transitive reduction, we can just enumerate all paths from a source to a sink.)
  4. For each path, expand it back to the corresponding set of vertices in G.

Python implementation:

import itertools
import networkx


def semiconnected_components(G):
    G0 = networkx.quotient_graph(G, list(networkx.strongly_connected_components(G)))
    G1 = networkx.transitive_reduction(G0)
    sources = {v for v in G1.nodes() if G1.in_degree(v) == 0}
    sinks = {v for v in G1.nodes() if G1.out_degree(v) == 0}
    for source in sources:
        for path in networkx.all_simple_paths(G1, source, sinks):
            yield list(itertools.chain(*map(sorted, path)))
    # Work around a bug in all_simple_paths (does not return length-0 paths, see
    # https://github.com/networkx/networkx/issues/6690).
    yield from map(list, sources & sinks)


def main():
    G = networkx.DiGraph()
    G.add_nodes_from({"X"})
    G.add_edges_from({"AB", "BC", "AE", "EF"})
    for S in semiconnected_components(G):
        print(S)
    print()
    G.add_edges_from({"CD", "DC", "CF"})
    for S in semiconnected_components(G):
        print(S)


if __name__ == "__main__":
    main()

Sample output:

['A', 'B', 'C']
['A', 'E', 'F']
['X']

['A', 'B', 'C', 'D', 'F']
['A', 'E', 'F']
['X']