Networkx Indirect ties

94 Views Asked by At

How to calculate indirect ties in networkx? Indirect ties are defined as a relationship between two individuals who have no direct relation but are connected through a third person in their social network.

Tried checking some of the existing libraries in Networkx like common neighbors etc. but did not match the definition of indirect ties.

2

There are 2 best solutions below

3
Ben A. On BEST ANSWER

I am putting this edit up top (thanks Mad Physicist for pointing this out).

Here is one possible way using an adjacency matrix method (explanations in comments):

import networkx as nx
import numpy as np

def refined_find_indirect_ties(G): #G is a graph

    # We get the adjacency matrix
    A = nx.adjacency_matrix(G)
    
    # Now we convert it to a numpy array
    A_but_np = A.toarray()
    
    # The crucial math part– we square (dot) the matrix
    A_but_squared = np.dot(A_but_np, A_but_np)
    
    # This method doesn't guarantee only indirect technically, so we do a little check

    n = len(G) # Now create list of tuples of nodes
    indirect_ties = [(i, j) for i in range(n) for j in range(n) 
                     if i != j and A_but_squared[i, j] > 0 and A_but_np[i, j] == 0]
    
    return indirect_ties

Edit: If you want to get the node names instead of the indices in the output, it's not that different:

import networkx as nx
import numpy as np

def refined_find_indirect_ties(G): # G is a graph

    # We get the adjacency matrix
    A = nx.adjacency_matrix(G)
    
    # Now we convert it to a numpy array
    A_but_np = A.toarray()
    
    # The crucial math part– we square (dot) the matrix
    A_but_squared = np.dot(A_but_np, A_but_np)
    
    # This method doesn't guarantee only indirect technically, so we do a little check

    # Get the list of nodes-- this is the first change
    nodes = list(G.nodes())  
    indirect_ties = [(nodes[i], nodes[j]) for i in range(len(nodes)) for j in range(len(nodes)) 
    # the only difference here is that the tuple is composed of the elements now
                     if i != j and A_but_squared[i, j] > 0 and A_but_np[i, j] == 0]
    
    return indirect_ties

While there isn't a specific NetworkX function for it, you can still create your own using NetworkX (I'll explain in the comments– sorry if verbose):

import networkx as nx

def find_indirect(G): #G is a graph

    # We create an empty dictionary to hold all of the pairs which have indirect ties

    indirect_ties = {}
    
    # Iterate over all pairs of nodes in our graph
    for x in G.nodes():
        for y in G.nodes():
            if x != y and not G.has_edge(x, y):  # That means 1. the nodes are not the same node and no edge exists between them
                neighbors_of_x = set(G.neighbors(x))
                neighbors_of_y = set(G.neighbors(y))
                
                # Finding the intersection of the sets creates the unique set of neighbors shared between the two
                common_neighbors = neighbors_of_x.intersection(neighbors_of_y)
                
                if common_neighbors: # Obviously if not common_neighbors is true obviates this– this is the check for being indirect now
                    indirect_ties[(x, y)] = list(common_neighbors) # Now we have each key in the dict as a tuple of the two nodes that have indirect ties, and the value for the key is what is shared
                    
    return indirect_ties

Obviously, you can change this up a bit. I did the dictionary approach to show how you can also store the common neighbors that are responsible for making the nodes indirect. However, you can just as easily modify the code and store only the indirect nodes.

0
Mad Physicist On

You can use adjacency matrices, for example as shown here: https://math.stackexchange.com/q/2696090/295281. In networkx, you can use nx.adjacency_matrix or nx.to_scipy_sparse_array:

def find_second_order(G):
    m = nx.to_scipy_sparse_array(G)
    g = nx.to_scipy_sparse_array(m @ m, create_using=type(G))
    return nx.relabel_nodes(g, dict(zip(range(len(G.nodes(), G.nodes())))))

The result is a graph whose edges indicate a second-order connection between nodes in the input.

You can make variants of this function that handle additional cases. So for example if you wanted all nodes connected by exactly n steps, you could repeatedly do something like this (Matrix power for sparse matrix in python):

def find_nth_order(G, n=2):
    def mpow(m, p):
        if p == 1:
            return m
        if p == 0:
            return 1
        t = mpow(p // 2)
        return m @ t @ t if p % 2 else t @ t

    m = nx.to_scipy_sparse_array(G)
    g = nx.to_scipy_sparse_array(mpow(m, n), create_using=type(G))
    return nx.relabel_nodes(g, dict(zip(range(len(G.nodes(), G.nodes())))))