One recommended method for getting a good cluster solution is to first run a hierarchical clustering method, choose a number of clusters, extract the centroids of those clusters, and then rerun the clustering with K-means using those centroids as the pre-specified starting centres. A toy example:
library(cluster)
# load the data set that is actually clustered below
data(agriculture)
ag.a <- agnes(agriculture, method = "ward")
# cut the tree into two clusters
ag.2 <- cutree(ag.a, k = 2)
This would give me two clusters. How can I extract the cluster centres in a format that I can then put into the kmeans() algorithm and reapply it to the same data?
You can use the clustering to assign cluster membership and then calculate the center for all the observations in a cluster. The kmeans function allows you to specify initial centers via the centers= parameter if you pass in a matrix. You can do that with
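something like the following, offered as a minimal sketch rather than a definitive implementation. It assumes the agriculture data from the cluster package and two clusters, as in the question. The names centers, init and km are illustrative, and wrapping the agnes result in as.hclust() before cutree() is a precaution added here, not something the question's code does.

```r
library(cluster)

# Hierarchical step on the agriculture data shipped with the cluster package
data(agriculture)
ag.a <- agnes(agriculture, method = "ward")

# Convert to an hclust tree first (an added precaution), then cut into 2 clusters
ag.2 <- cutree(as.hclust(ag.a), k = 2)

# Mean of every variable within each hierarchical cluster: one row per cluster
centers <- aggregate(agriculture, by = list(cluster = ag.2), FUN = mean)

# Drop the grouping column so only the coordinates remain, as a numeric matrix
init <- as.matrix(centers[, -1])

# Rerun as k-means on the same data, starting from the hierarchical centroids
km <- kmeans(agriculture, centers = init)
```

Because centers= receives a matrix rather than a single number, kmeans() takes the number of clusters from the number of rows in that matrix and starts from those points instead of from a random selection.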