I am trying to fit the DBSCAN clustering algorithm to a dataset. The dataset's column headers look like this:
[Summary Severity category component priority Issues]
and a sample row looks like this:
[aaa_bbb_ccc @ABC 2.0 @August Regression - XYZ - ABCD - We have got the Income Reasonability Rule (V60300) but we are not getting the SSR (Self Service) Page for ABCD in XYZ environment. , 2-Major ,Application - Code , ABCD ,High ,We have got the Income Reasonability Rule (V60300) but we are not getting the SSR (Self Service) Page for ABCD in XYZ environment.]
...........
When I did so, I got "ValueError: could not convert string to float" from scaler.fit_transform, which I resolved using https://stackoverflow.com/a/53473541/13659094
Since then I have been facing a new issue:
X1 = np.concatenate([transformed, X[:, 0:]], axis=1)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2657, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 110, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(slice(None, None, None), slice(0, None, None))' is an invalid key
The code looks like this:
scaler = StandardScaler()
X = final_data
X_sum = X.iloc[:,[0]]
onehotencoder = OneHotEncoder(handle_unknown='ignore')
transformed = onehotencoder.fit_transform(X_sum).toarray().reshape(1,-1)
X1 = np.concatenate([transformed, X[:, 1:]], axis=1)
X_scaled = scaler.fit_transform(X1).reshape(1,-1)
dbscan = DBSCAN(eps=0.123, min_samples = 2)
clusters = dbscan.fit_predict(X_scaled)
plt.scatter(X1[:, 0], X1[:, 1], c=clusters, cmap="plasma")
plt.xlabel("Feature 0")
plt.ylabel("Feature 1")
Can anyone please help me resolve this issue so that I can fit the DBSCAN algorithm to this dataset?
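
For reference, the traceback appears to come from the `X[:, 1:]` expression: `X` is a pandas DataFrame, and NumPy-style `[rows, cols]` indexing on a DataFrame is treated as a column-label lookup, so positional slicing has to go through `.iloc`. Separately, `.reshape(1, -1)` collapses all rows into a single sample, which would likely break both the concatenation and DBSCAN. The following is a minimal sketch of one possible fix, not a definitive solution. It assumes the remaining columns are also strings and one-hot encodes them as well. The small `final_data` frame, its values, and the `eps` value are hypothetical stand-ins for the real data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for final_data, using the column names from the question.
final_data = pd.DataFrame({
    "Summary": ["login page error", "report export fails",
                "login page error", "slow dashboard"],
    "Severity": ["2-Major", "3-Minor", "2-Major", "1-Critical"],
    "category": ["Application - Code", "Application - Data",
                 "Application - Code", "Environment"],
    "component": ["ABCD", "EFGH", "ABCD", "IJKL"],
    "priority": ["High", "Low", "High", "Medium"],
    "Issues": ["login page error", "report export fails",
               "login page error", "slow dashboard"],
})

X = final_data

# Positional slicing on a DataFrame uses .iloc; X[:, 1:] raises the error above.
X_sum = X.iloc[:, [0]]
X_rest = X.iloc[:, 1:]

# Keep one row per sample: no reshape(1, -1) after encoding.
sum_encoder = OneHotEncoder(handle_unknown="ignore")
transformed = sum_encoder.fit_transform(X_sum).toarray()

# Assumption: the other columns are strings too, so they are encoded as well.
rest_encoder = OneHotEncoder(handle_unknown="ignore")
rest_transformed = rest_encoder.fit_transform(X_rest).toarray()

# Both arrays now have the same number of rows, so they can be joined column-wise.
X1 = np.concatenate([transformed, rest_transformed], axis=1)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X1)

# eps and min_samples are placeholders and would need tuning on real data.
dbscan = DBSCAN(eps=0.5, min_samples=2)
clusters = dbscan.fit_predict(X_scaled)
print(clusters)
```

On real data, free-text columns such as Summary and Issues will have mostly unique values, so one-hot encoding them may produce very wide, uninformative features. A text vectorizer such as scikit-learn's `TfidfVectorizer` might be a better fit for those columns, though that is a design choice outside the scope of the error itself.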