How to add an ellipse/polygon and a legend to a 3D scatterplot in R?


I ran k-means clustering and plotted the result in 3D with scatterplot3d; the optimal number of clusters came out as 4. The code below draws the 3D k-means diagram, but I don't know how to add a legend (Cluster 1, Cluster 2, ...) or a polygon or ellipse around each cluster.

library(scatterplot3d)
df = read.csv(file.choose(), header = T)
str(df)
set.seed(137)
km = kmeans(df, 4, nstart=25)
df$cluster = factor(km$cluster)  # reuse the fit above; a second kmeans() call may label the clusters differently
shapes = c(15, 17, 18, 19)
shapes = shapes[as.numeric(km$cluster)]
zz <- scatterplot3d(df[,-1], pch = shapes, grid = F, color = rainbow(4)[km$cluster])
zz.coords <- zz$xyz.convert(df[,-1]) 
text(zz.coords$x, 
     zz.coords$y,             
     labels = df[,1],               
     cex = .5, 
     pos = 4)

The data I used:

df = structure(list(Station = c(91L, 92L, 94L, 95L, 96L, 97L, 98L, 
99L, 103L, 105L, 106L, 107L, 112L, 113L, 114L, 115L, 116L, 117L, 
119L, 122L, 123L), PC1 = c(-1.12015813, -0.162603353, -0.691170969, 
1.290790838, -1.544758302, -0.027602401, -0.614168423, -0.495333696, 
1.279949324, -0.63809094, -0.122668147, 0.744142235, -0.778789917, 
-2.633418969, 0.758488847, -1.066042887, -0.721002477, 1.079359567, 
-0.175695907, 5.318351482, 0.320422224), PC2 = c(0.415530088, 
0.666545002, 0.695404632, -0.554688073, 0.222592134, 0.697855329, 
1.242555625, 0.64505761, 0.702722431, 0.86363373, 0.827648227, 
1.204083925, -4.232859175, -1.636251717, -0.685291914, 1.16637059, 
-0.122979187, -1.485372609, 0.079916797, -0.402557579, -0.309915867
), PC3 = c(0.698238358, -0.071855245, -0.563762392, -0.382756976, 
0.224635711, -0.786947583, -1.983627389, -0.032803567, -0.465769148, 
-0.076804882, -1.775067301, -0.203906481, -0.419354014, -0.121035409, 
-1.171532233, 3.578306494, 0.91365978, 0.951076125, 0.516367241, 
0.770666924, 0.402271986), cluster = structure(c(2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L), .Label = c("1", "2", "3", "4"), class = "factor")), row.names = c(NA, 
-21L), class = "data.frame")
There is 1 solution below

You could use the legend() function with scatterplot3d(), but I don't think there's a way to draw the ellipses. To do that you could switch to using the rgl package. After running your code, this will do the plot:

shapes <- c(15, 17, 18, 19)
shapes1 <- shapes[as.numeric(km$cluster)]

library(rgl)

plot3d(df[,-1], type = "n")
pch3d(df[,-1], pch = shapes1, col = rainbow(4)[km$cluster], cex = 0.5)
text3d(df[,-1], text = df[,1], pos=4)

for (i in 1:4) {
  cl <- subset(df, cluster == i)
  if (nrow(cl) > 3) {
    sigma <- cov(cl[,2:4])
    mu <- apply(cl[,2:4], 2, mean)
    shade3d(ellipse3d(sigma, centre = mu), alpha = 0.2, col = rainbow(4)[i])
  }
}

legend3d("right", pch = shapes, col = rainbow(4), 
         legend = paste("Cluster", 1:4), cex=2)

After some fiddling with the size of the window and rotating the plot, this produces:

[Image: the rgl plot with shaded ellipsoids and the cluster legend]
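If you would rather stay with scatterplot3d and can live without the ellipsoids, the legend() route mentioned at the top looks roughly like this. It is only a minimal sketch: the data frame `pts` and the labels in `cl` are made up for illustration and stand in for your PC scores and `km$cluster`.

```r
library(scatterplot3d)

# Made-up stand-in data (assumption): 4 groups of 5 points each
set.seed(1)
shift <- rep(c(-3, -1, 1, 3), each = 5)
pts <- data.frame(PC1 = rnorm(20) + shift,
                  PC2 = rnorm(20) - shift,
                  PC3 = rnorm(20))
cl <- rep(1:4, each = 5)   # hypothetical cluster labels, in place of km$cluster

shapes <- c(15, 17, 18, 19)
cols   <- rainbow(4)

# scatterplot3d draws with base graphics, so base legend() can be added on top
s3d <- scatterplot3d(pts, pch = shapes[cl], color = cols[cl], grid = FALSE)
legend("topright", legend = paste("Cluster", 1:4),
       pch = shapes, col = cols, bty = "n")
```

Note that the legend gets the unindexed `shapes` and `cols` vectors (one entry per cluster), while the points get them indexed by cluster.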

The loop doesn't draw an ellipsoid for cluster 1, because you only have 3 points there: you need at least 4 points, not all in one plane, to get a non-degenerate ellipsoid.
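You can check the point-count requirement directly on the covariance matrix. The sketch below uses two small made-up point sets (not your data): three points always lie in a common plane, so their 3x3 covariance matrix has rank at most 2 and describes a flat ellipsoid with zero volume. A fourth point off that plane brings the rank up to 3.

```r
# Three made-up points in 3D; any three points lie in a common plane
three <- rbind(c(0, 0, 0),
               c(1, 2, 3),
               c(-1, 0, 2))

# Add a fourth point that is not in that plane
four <- rbind(three, c(2, -1, 1))

# Numerical rank of each 3x3 covariance matrix
rank3 <- qr(cov(three))$rank   # rank-deficient: flat "ellipsoid"
rank4 <- qr(cov(four))$rank    # full rank: a proper ellipsoid

print(c(three_points = rank3, four_points = rank4))
```

That is what the `nrow(cl) > 3` guard in the loop is for: it skips clusters that cannot have a full-rank covariance matrix.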