I have hierarchical data in this form:
df <- data.frame(root = rep("unclustered", 22),
                 itr1 = paste0("1.", c(1,5,1,2,4,1,3,2,5,5,6,9,4,3,4,8,5,7,3,2,10,8)),
                 itr2 = paste0("2.", c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,10,17,18,19,20,21)),
                 itr3 = paste0("3.", c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22)),
                 stringsAsFactors = FALSE)
This describes an iterative cluster assignment of data points (the rows). The first column, root, assigns all points to the root cluster; each subsequent column is one iteration of the clustering, which takes the clusters of the previous iteration and breaks them down into further sub-clusters.
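As a sanity check on that nesting, here is a minimal sketch that verifies every cluster sits inside exactly one cluster of the previous iteration. The toy data frame and the helper name is_nested are made up for illustration; the same check should work on df as long as it is run before any extra columns are added.

```r
# Toy data in the same shape as df (hypothetical values, for illustration only)
toy <- data.frame(root = rep("unclustered", 4),
                  itr1 = c("1.1", "1.1", "1.2", "1.2"),
                  itr2 = c("2.1", "2.2", "2.3", "2.3"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: TRUE if every cluster in `child` has a single parent in `parent`
is_nested <- function(parent, child) {
  all(tapply(parent, child, function(p) length(unique(p)) == 1))
}

# Compare each column with the one before it (root vs itr1, itr1 vs itr2)
nested <- mapply(is_nested, toy[-ncol(toy)], toy[-1])
all(nested)
```

If any element of nested is FALSE, some cluster spans more than one parent and the data would not form a proper tree.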
I'm trying to plot this process using a tree network.
I know that with data.tree I can simply do:
df$pathString <- do.call(paste,c(df,sep="/"))
df.graph <- data.tree::as.Node(df)
plot(df.graph)
But I'm looking for something a bit fancier, preferably with a ggplot look.
So I converted df.graph to an igraph object:
df.igraph <- data.tree::as.igraph.Node(df.graph)
And tried to use ggraph:
library(ggraph)
ggraph(df.igraph, 'igraph', algorithm = 'tree') +
  geom_edge_link() +
  ggforce::theme_no_axes()
Any idea how to get the ggraph version to show the nodes with their labels, add arrows to the edges, and perhaps color each level differently?
This seems to get me close: