Hello, I have written the simple code below to explore fuzzy c-means clustering:
import pandas as pd
import numpy as np
from os import listdir
from sklearn.model_selection import train_test_split
from skfuzzy.cluster import cmeans, cmeans_predict
from sklearn.metrics import classification_report,confusion_matrix
def find_csv_filenames(path_to_dir, suffix=".csv"):
    filenames = listdir(path_to_dir)
    return [path_to_dir + filename for filename in filenames if filename.endswith(suffix)]

listFiles = find_csv_filenames('<Path to folder with csv files>')
for files in listFiles:
    df = pd.read_csv(files)
    df.loc[df['bug'] > 1, 'bug'] = 1
    df2 = df.iloc[:, 3:]
    # The lines above are some preprocessing steps
    # Split the data into train and test sets
    X_train, X_test = train_test_split(df2, test_size=0.30)
    # Drop the bug column for unsupervised learning
    X_train2 = X_train.drop('bug', axis=1)
    X_test2 = X_test.drop('bug', axis=1)
    print(X_train2.shape)
    # Shape is (163, 20): 163 training samples with 20 features
    cntr, u, u0, d, jm, p, fpc = cmeans(X_train2, 2, 2, 0.25, 500, init=None, seed=None)
    print(cntr.shape)
    # The shape printed above is (2, 163)
The cluster centers returned by the cmeans call above have shape (2, 163), but since my training data has only 20 features, I expected cntr to have shape (2, 20). I am unable to understand where I went wrong.
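From the skfuzzy documentation: cmeans expects its data argument as a 2D array of size (S, N), where S is the number of features and N is the number of samples. Your DataFrame has samples as rows, so it is read as 20 samples with 163 features each. So you need to transpose your input. Not tested, but:
cntr, u, u0, d, jm, p, fpc = cmeans(X_train2.T, 2, 2, 0.25, 500, init=None, seed=None)
should work.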
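To see why the orientation matters, here is a minimal NumPy sketch of the standard fuzzy c-means center update. It is not skfuzzy's own code; the helper names fcm_center_update and random_memberships are hypothetical, and random data stands in for X_train2 (163 samples, 20 features). It only assumes the same (features, samples) layout described above.

```python
import numpy as np

def fcm_center_update(data, u, m=2.0):
    """One fuzzy c-means center update (illustrative, not skfuzzy's code).

    data: array of shape (S, N) = (features, samples)
    u:    membership matrix of shape (c, N)
    Returns centers of shape (c, S).
    """
    um = u ** m  # fuzzified memberships, shape (c, N)
    # Membership-weighted mean of the samples for each cluster
    return um @ data.T / um.sum(axis=1, keepdims=True)

def random_memberships(c, n, rng):
    """Random memberships of shape (c, n) whose columns sum to 1."""
    u = rng.random((c, n))
    return u / u.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
X = rng.random((163, 20))  # stand-in for X_train2: rows are samples

# Passed as-is, X is read as 20 samples with 163 features each
wrong = fcm_center_update(X, random_memberships(2, X.shape[1], rng))

# Transposed, X.T has shape (20, 163) = (features, samples)
right = fcm_center_update(X.T, random_memberships(2, X.shape[0], rng))

print(wrong.shape)  # (2, 163)
print(right.shape)  # (2, 20)
```

The centers always get one row per cluster and one column per feature, so whichever axis is read as "features" sets the second dimension of cntr.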