I am using the fuzzy-c-means clustering implementation, and I would like the data X to form the number of clusters I define in the algorithm (I believe that is how it works). But the behavior is confusing.
cm = FCM(n_clusters=6)
cm.fit(X)
This code generates a plot with 4 labels - [0,2,4,6]
cm = FCM(n_clusters=4)
cm.fit(X)
This code generates a plot with 4 labels - [0,1,2,3]
I expect labels [0,1,2,3,4,5] when I initialize the number of clusters to 6.
Code:
from fcmeans import FCM
from matplotlib import pyplot as plt
from seaborn import scatterplot as scatter
# fit the fuzzy-c-means
fcm = FCM(n_clusters=6)
fcm.fit(X)
# outputs
fcm_centers = fcm.centers
fcm_labels = fcm.u.argmax(axis=1)
# plot result
%matplotlib inline
f, axes = plt.subplots(1, 2, figsize=(11,5))
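# pass x and y by keyword; recent seaborn versions no longer accept them positionally
scatter(x=X[:,0], y=X[:,1], ax=axes[0])
scatter(x=X[:,0], y=X[:,1], ax=axes[1], hue=fcm_labels)
scatter(x=fcm_centers[:,0], y=fcm_centers[:,1], ax=axes[1], marker="s", s=200)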
plt.show()
I read about it, and it looks like once the algorithm reaches the knee point (the maximum number of clusters it can form with the data), it won't create any more than that. So in my case, 4 was the maximum number of clusters that the algorithm could form with the given dataset.
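One way to check how many clusters were really used is to count the distinct hard labels directly instead of reading them off the plot legend. Below is a minimal sketch of that check. The membership matrix `u` here is a randomly generated stand-in (an assumption, not my real data); in the actual case it would be `fcm.u` from the fitted model.

```python
import numpy as np

# Hypothetical stand-in for fcm.u: an (n_samples, n_clusters) membership
# matrix whose rows sum to 1. Replace with fcm.u from the fitted model.
rng = np.random.default_rng(0)
n_clusters = 6
u = rng.dirichlet(np.ones(n_clusters), size=200)

# Hard labels, computed the same way as in the code above
labels = u.argmax(axis=1)

# Distinct labels that actually occur, and how many points each one has
used, counts = np.unique(labels, return_counts=True)
print("clusters actually used:", used)
print("points per cluster:", counts)
```

If this reports more distinct labels than the legend shows, the legend may be the cause rather than the clustering: with a numeric hue, seaborn's default legend may show only a sample of evenly spaced values. Passing legend="full" to scatterplot should list every label.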