Knowledge graphs with tuples

My goal is to create a knowledge graph from a CSV file that contains source, edge, and target columns. Some details about the data:

  • It is not visible in the image, but there are two edge labels: 1) "used for" and 2) "related to".
  • The target column holds tuples with 20 words.

The first image shows the format I would like to see. The second image is the head of my CSV data file, and the third image shows the failed graph visualization produced by the code below, which is what I have tried so far.

# Create a directed multigraph from a dataframe
import matplotlib.pyplot as plt
import networkx as nx

G = nx.from_pandas_edgelist(tuple_predictions_IB_for_graph, "source", "target",
                            edge_attr=True, create_using=nx.MultiDiGraph())

plt.figure(figsize=(12, 12))

pos = nx.spring_layout(G)
nx.draw(G, with_labels=True, node_color='skyblue', edge_cmap=plt.cm.Blues, pos=pos)
plt.show()

source

desired format of the outputted graph

actual resulting graph

Best answer

Use the dataframe's explode method to create one row per target, so that each target is paired with its own source. Building the graph from the exploded dataframe then gives one node per target, as desired.
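One caveat: explode only splits cells that hold real list-like objects. If the dataframe was read straight from a CSV file, each target cell may instead be a string such as "('computer', 'keyboard')", and explode would leave it as a single value. The sketch below is a minimal, standard-library illustration of parsing such cells first. It assumes the cells are written as Python literals; the helper name parse_targets and the sample CSV text are hypothetical, not taken from the question's data.

```python
import ast
import csv
import io


def parse_targets(cell):
    """Turn a CSV cell such as "('computer', 'mouse')" into a list of strings."""
    try:
        value = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        # Not a Python literal: treat the whole cell as a single target.
        return [cell.strip()]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return [str(value)]


# Hypothetical CSV content with the same three columns as in the question.
sample_csv = '''source,edge,target
motherboard,related to,"('computer', 'keyboard')"
screen,used for,"('monitor',)"
'''

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# One (source, edge, target) triple per target, which is what explode produces.
edges = [(row["source"], row["edge"], target)
         for row in rows
         for target in parse_targets(row["target"])]
print(edges)
```

With pandas, the same helper could be applied before exploding, for example df['target'] = df['target'].apply(parse_targets), assuming the column was loaded as strings.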

import networkx as nx
import pandas as pd

# Make sample data
tuple_predictions_IB_for_graph = pd.DataFrame({
    'source': ['motherboard', 'screen', 'keyboard', 'bluetooth', 'webcam'],
    'edge': ['related to'] * 4 + ['other_label'],
    'target': [['computer', 'keyboard', 'mouse', 'monitor'],
               ['monitor', 'mouse', 'computer', 'tv'],
               ['mouse', 'keyboard', 'monitor'],
               ['tooth enamel', 'tooth decay', 'tooth fairy'],
               ['webcam', 'camera', 'video camera', 'eyepiece']]})

# Explode the target column
tuple_df_exploded = tuple_predictions_IB_for_graph.explode(column = 'target')
tuple_df_exploded.head()
#         source        edge    target
# 0  motherboard  related to  computer
# 0  motherboard  related to  keyboard
# 0  motherboard  related to     mouse
# 0  motherboard  related to   monitor
# 1       screen  related to   monitor

# Make the graph accepting the 'edge' column as an edge attribute
g = nx.from_pandas_edgelist(tuple_df_exploded,
                            source='source',target='target',edge_attr='edge',
                            create_using=nx.MultiDiGraph())

pos = nx.spring_layout(g)
# Draw nodes, labels, and edges with the default styling
nx.draw_networkx(g, pos)
# Redraw the edges, coloring each one by its 'edge' attribute from the dataframe
color_by_label = {'related to': 'black', 'other_label': 'red'}
nx.draw_networkx_edges(g, pos, edgelist=g.edges(),
                       edge_color=[color_by_label[edge_label]
                                   for u, v, edge_label in g.edges(data='edge')])

graph with colored edges
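
The color lookup above raises a KeyError for any edge label that is missing from the dictionary. The question mentions the labels "used for" and "related to", so the mapping would need to match the real data. A minimal sketch of a more forgiving lookup follows; the palette, the fallback color, and the helper name edge_colors are assumptions chosen for illustration.

```python
# Hypothetical palette for the two edge labels mentioned in the question.
EDGE_COLORS = {"related to": "black", "used for": "red"}
DEFAULT_COLOR = "gray"  # fallback for labels that are not in the palette


def edge_colors(labels, palette=EDGE_COLORS, default=DEFAULT_COLOR):
    """Return one color per label, using the fallback for unknown labels."""
    return [palette.get(label, default) for label in labels]


labels = ["related to", "used for", "unknown label"]
print(edge_colors(labels))
```

The resulting list could be passed as edge_color, for example edge_color=edge_colors(label for u, v, label in g.edges(data='edge')).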