Plot a tree-graph with ggplot

I have hierarchical data in this form:

df <- data.frame(root=rep("unclustered",22),
  itr1=paste0("1.",c(1,5,1,2,4,1,3,2,5,5,6,9,4,3,4,8,5,7,3,2,10,8)),
  itr2=paste0("2.",c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,10,17,18,19,20,21)),
  itr3=paste0("3.",c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22)),stringsAsFactors = FALSE)

It describes the iterative cluster assignment of the data points (the rows). The first column, root, assigns all points to a single root cluster. Each subsequent column is one iteration of the clustering, which takes the clusters of the previous iteration and splits them into further sub-clusters.
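As a quick sanity check that these columns really form a tree (every cluster has exactly one parent in the previous iteration), a base R sketch along these lines could be used. The helper name `is_nested` is made up here for illustration and is not part of any package.

```r
# Same data as above, rebuilt so the snippet runs on its own
df <- data.frame(root = rep("unclustered", 22),
  itr1 = paste0("1.", c(1,5,1,2,4,1,3,2,5,5,6,9,4,3,4,8,5,7,3,2,10,8)),
  itr2 = paste0("2.", c(1:16, 10, 17:21)),
  itr3 = paste0("3.", 1:22),
  stringsAsFactors = FALSE)

# Number of distinct clusters at each level of the hierarchy
n_clusters <- sapply(df, function(x) length(unique(x)))

# Hypothetical helper: TRUE if every child cluster maps to exactly one parent
is_nested <- function(parent, child) {
  all(tapply(parent, child, function(p) length(unique(p))) == 1)
}

nested <- c(itr1 = is_nested(df$root, df$itr1),
            itr2 = is_nested(df$itr1, df$itr2),
            itr3 = is_nested(df$itr2, df$itr3))

print(n_clusters)
print(nested)
```

If any entry of `nested` were FALSE, the path strings built from the rows would no longer describe a proper tree.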

I'm trying to plot this process using a tree network.

I know that using data.tree, I can simply do:

df$pathString <- do.call(paste,c(df,sep="/"))
df.graph <- data.tree::as.Node(df)
plot(df.graph)

But I'm looking for something a bit fancier, preferably with a ggplot look.

So I converted df.graph to an igraph object:

df.igraph <- data.tree::as.igraph.Node(df.graph)

And tried to use ggraph:

library(ggraph)
ggraph(df.igraph, 'igraph', algorithm = 'tree') + 
  geom_edge_link() +
  ggforce::theme_no_axes()

Any idea how to get the ggraph option to include the nodes with their labels, add arrows to the edges, and perhaps color each level differently?

There is 1 best solution below

This seems to get me close:

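library(igraph)  # for V()
library(ggraph)

# store each vertex name in an attribute that can be mapped to the label
V(df.igraph)$class <- names(V(df.igraph))

ggraph(df.igraph,layout='tree')+
  geom_edge_link(arrow=arrow(length=unit(2,'mm')),end_cap=circle(3,'mm'))+
  geom_node_label(aes(label=class))+
  theme_void()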
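The remaining part of the question, coloring each level differently, could be approached by storing each vertex's depth as an attribute and mapping it to the label fill. The following is only a sketch under a few assumptions: the tree is rebuilt from a pathString-only data frame, `directed = TRUE` and `direction = "descend"` are passed to `as.igraph.Node` so the arrows point from parent to child, and the attribute name `level` is my own choice.

```r
library(data.tree)
library(igraph)
library(ggraph)

df <- data.frame(root = rep("unclustered", 22),
  itr1 = paste0("1.", c(1,5,1,2,4,1,3,2,5,5,6,9,4,3,4,8,5,7,3,2,10,8)),
  itr2 = paste0("2.", c(1:16, 10, 17:21)),
  itr3 = paste0("3.", 1:22),
  stringsAsFactors = FALSE)

# Build the tree from the path strings only
paths <- data.frame(pathString = do.call(paste, c(df, sep = "/")),
                    stringsAsFactors = FALSE)
df.graph <- data.tree::as.Node(paths)

# Assumption: directed edges running from the root towards the leaves
df.igraph <- data.tree::as.igraph.Node(df.graph, directed = TRUE,
                                       direction = "descend")

# Depth of each vertex = number of edges between it and the root
depth <- igraph::distances(df.igraph, v = "unclustered", mode = "all")
V(df.igraph)$level <- factor(as.vector(depth))

p <- ggraph(df.igraph, layout = "tree") +
  geom_edge_link(arrow = arrow(length = unit(2, "mm")),
                 end_cap = circle(3, "mm")) +
  geom_node_label(aes(label = name, fill = level)) +  # one fill color per level
  theme_void()
print(p)
```

Mapping `level` to `fill` uses the default discrete palette; a different fill scale can be added to the plot if other colors are preferred.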