Sankey Diagram for transitions


I am trying to replicate the code and the issue from the Stack Overflow question "Sankey diagram in R".

Adding some sample data

head(links) #Data.frame

Source   Target  Weight 
 Fb        Google  20 
 Fb         Fb      2
 BBC        Google 21
 Microsoft  BBC    16 

head(nodes) 
Fb
BBC
Google
Microsoft 

Code for building a sankey transition flow

sankeyNetwork(Links = links, 
              Nodes = nodes, 
              Source = "Source",
              Target = "Target", 
              Value = "value", 
              fontSize = 12, 
              nodeWidth = 30)

The Stack Overflow post mentioned above says that the source and target should be zero-indexed. However, when I try the same syntax, I get NAs in my Source and Target columns. What could be causing this?


There are 2 best solutions below

Best answer

You can convert your Source and Target variables in your links data frame to the index of the nodes in your nodes data frame like so...

links <- read.table(header = TRUE, text = "
Source   Target  Weight
Fb        Google  20
Fb         Fb      2
BBC        Google 21
Microsoft  BBC    16
")

nodes <- read.table(header = TRUE, text = "
name
Fb
BBC
Google
Microsoft
")

# set the Source and Target values to the index of the node (zero-indexed) in
# the nodes data frame
links$Source <- match(links$Source, nodes$name) - 1
links$Target <- match(links$Target, nodes$name) - 1

print(links)
print(nodes)

# pass the name of the links column that holds the flow values to the Value
# parameter (here "Weight", not "value")
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = "Source", 
              Target = "Target", Value = "Weight",
              fontSize = 12, nodeWidth = 30)
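
As for where the NAs come from: match() returns NA for any value in links that has no exact counterpart in nodes$name. A name missing from the nodes table, stray whitespace, or a nodes column that is not called name would all do it. Which of these applied in the question is not known, so the following is only a sketch in base R. The helper to_zero_index is a hypothetical name, not part of networkD3. It stops with a message instead of silently producing NAs, and the nodes table is built from the links themselves so every name is guaranteed a match.

```r
# Hypothetical helper (not part of networkD3): convert a column of node names
# to zero-based indices, stopping if any name is absent from the nodes table.
to_zero_index <- function(x, node_names) {
  idx <- match(trimws(as.character(x)), trimws(as.character(node_names)))
  if (anyNA(idx)) {
    missing <- unique(as.character(x)[is.na(idx)])
    stop("names not found in nodes: ", paste(missing, collapse = ", "))
  }
  idx - 1L
}

links <- data.frame(
  Source = c("Fb", "Fb", "BBC", "Microsoft"),
  Target = c("Google", "Fb", "Google", "BBC"),
  Weight = c(20, 2, 21, 16),
  stringsAsFactors = FALSE
)

# Build the nodes table from the links, so every name has a match.
nodes <- data.frame(
  name = unique(c(links$Source, links$Target)),
  stringsAsFactors = FALSE
)

links$Source <- to_zero_index(links$Source, nodes$name)
links$Target <- to_zero_index(links$Target, nodes$name)

print(links)
print(nodes)
```

The resulting links and nodes can be passed to sankeyNetwork() exactly as in the call above.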

Another answer

This code produced the plot at the bottom. See the comments in the code for an explanation of the changes from your code. A useful resource on the topic: "several methods with R to create Sankey (river) plots".

library(networkD3)  

# change to a numeric index starting at 0:
# Fb = 0, BBC = 1, MS = 2, Google = 3
links <- data.frame(Source = c(0, 0, 1, 2),
                    Target = c(3, 0, 3, 1),
                    Weight = c(20, 2, 21, 16))

# a nodes data frame (or a data frame element of a list, as in the help) is
# needed; its row order must match the indices used above
nodes <- data.frame(name = c("Fb", "BBC", "MS", "Google"))

sankeyNetwork(Links = links, 
              Nodes = nodes, 
              Source = "Source",
              Target = "Target", 
              Value = "Weight",   # changed from "value"
              fontSize = 12, 
              nodeWidth = 30)   
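
When the indices are typed by hand, it is easy for the row order of nodes to drift out of step with them, and the diagram then shows the wrong labels without any error. A quick check, sketched here in base R, is to decode the indices back into names and compare them with the original data. The node order Fb, BBC, MS, Google is an assumption: it is the order under which these indices reproduce the links from the question.

```r
links <- data.frame(Source = c(0, 0, 1, 2),
                    Target = c(3, 0, 3, 1),
                    Weight = c(20, 2, 21, 16))

# Assumed node order: the one that makes the indices above match the question.
nodes <- data.frame(name = c("Fb", "BBC", "MS", "Google"),
                    stringsAsFactors = FALSE)

# R vectors are 1-based, so add 1 to look up a zero-based index.
decoded <- data.frame(Source = nodes$name[links$Source + 1],
                      Target = nodes$name[links$Target + 1],
                      Weight = links$Weight,
                      stringsAsFactors = FALSE)
print(decoded)

# Every index must point at an existing row of nodes.
stopifnot(all(c(links$Source, links$Target) %in% (seq_len(nrow(nodes)) - 1)))
```

If the decoded rows do not read Fb to Google, Fb to Fb, BBC to Google, and MS to BBC, the nodes are listed in the wrong order for these indices.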

[Image: the resulting Sankey diagram]