Labeling a particular cluster in K-means in R


What changes to the code below are needed if I want to label only the data points in cluster 3?

library(datasets)
head(iris)
library(ggplot2)
ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point()
set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)
irisCluster

table(irisCluster$cluster, iris$Species)
    setosa versicolor virginica

irisCluster$cluster <- as.factor(irisCluster$cluster)
ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point()

There are 2 best solutions below

Best answer

You can set the label to an empty string for every point that is not in cluster 3, so that only cluster 3 points show text. You may need to adjust the label positions (vjust and hjust) to suit your actual data.

library(dplyr)
library(ggplot2)

iris %>%
  mutate(cluster = irisCluster$cluster, 
         label = replace(Petal.Length, cluster != 3, '')) %>%
  ggplot() + aes(Petal.Length, Petal.Width, color = cluster, label = label) + 
  geom_point() + geom_text(vjust = -0.5, hjust = -0.4)
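
A variation on the same idea, offered as a sketch rather than a drop-in replacement: instead of blanking labels, you can pass only the cluster 3 rows to geom_text() through its data argument. This keeps Petal.Length numeric in the main data frame. The names km, plot_data and to_label are placeholders introduced here, not part of the original code.

```r
library(ggplot2)

set.seed(20)
km <- kmeans(iris[, 3:4], centers = 3, nstart = 20)

# Copy iris and attach the cluster assignment as a factor
plot_data <- iris
plot_data$cluster <- factor(km$cluster)

# Keep only the rows assigned to cluster 3; only these get a text label
to_label <- plot_data[plot_data$cluster == "3", ]

p <- ggplot(plot_data, aes(Petal.Length, Petal.Width, color = cluster)) +
  geom_point() +
  # The label layer uses the subset, so other clusters are drawn without text
  geom_text(data = to_label, aes(label = Petal.Length),
            vjust = -0.5, hjust = -0.4, show.legend = FALSE)
p
```

The label aesthetic here is Petal.Length only because the answer above uses it; any column of to_label should work the same way.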

Your question is somewhat ambiguous, but if you want to highlight points in a specific cluster you can use the gghighlight package, e.g.

library(datasets)
library(ggplot2)
#install.packages("gghighlight")
library(gghighlight)

set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)
irisCluster

table(irisCluster$cluster, iris$Species)

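# Store the assignment as a factor so ggplot2 uses a discrete color scale
iris$cluster <- as.factor(irisCluster$cluster)

ggplot(iris, aes(Petal.Length, Petal.Width, color = cluster)) +
  geom_point() +
  # Keep cluster 3 in color and grey out every other point
  gghighlight(cluster == 3, keep_scales = TRUE)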
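
Note that the cluster numbers returned by kmeans() are arbitrary and can change with the seed, so "cluster 3" may not always be the same group of points. A minimal sketch, assuming you want the cluster that contains most of one species (virginica is used purely as an example, and km, counts and target are hypothetical names):

```r
set.seed(20)
km <- kmeans(iris[, 3:4], centers = 3, nstart = 20)

# Rows are cluster ids, columns are species
counts <- table(cluster = km$cluster, species = iris$Species)

# Cluster id that holds the most virginica flowers
target <- as.integer(names(which.max(counts[, "virginica"])))
target
```

You could then try gghighlight(cluster == target) instead of hard-coding 3.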
