I am using silhouette analysis with K-means clustering, based on the code found here:
https://medium.com/@cmukesh8688/silhouette-analysis-in-k-means-clustering-cefa9a7ad111
However, when I run the code (using my own data frame), I get different results. In some cases
the optimum number of clusters is 2, while in others it is 5. Can anyone explain why this is happening?
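The KMeans algorithm starts by choosing the initial cluster centers randomly, then iteratively refines them by alternating between assigning points to the nearest center and recomputing each center as the mean of its points.
Because of this random initialization, different runs can converge to different local optima, so the silhouette scores (and the best number of clusters) can change from run to run. If the result swings between 2 and 5, your data may not have a clear cluster structure.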
Try running your analysis with the random state fixed (for example random_state=0) on every KMeans fit in the loop.
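A minimal sketch of what that could look like, assuming scikit-learn. The synthetic data from make_blobs is only a stand-in for your data frame, and the helper name best_k and the range of k values are my own choices, not something from the linked article:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumption: synthetic stand-in data; replace X with the features from your data frame.
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)


def best_k(X, k_values=range(2, 7), seed=0):
    """Return the k with the highest silhouette score, plus all scores.

    best_k is a hypothetical helper written for this example.
    """
    scores = {}
    for k in k_values:
        # Fixing random_state makes the centroid initialization repeatable.
        km = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = km.fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores


# Two runs with the same seed should agree on the best k.
k_first, scores_first = best_k(X)
k_second, scores_second = best_k(X)
print(k_first, k_second)
```

If the best k still changes when you try a few different seeds (seed=1, seed=2, ...), that is a hint that the silhouette scores for those k values are close and the data does not strongly favor one of them.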
Does this lead to the same optimum?