Data Preparation for Clustering Data with Very Low Variance


I am trying to cluster data from a production machine using K-Means, DBSCAN and OPTICS. With all three algorithms, the results are very poor (e.g. a silhouette coefficient of 0.05).

From my point of view, the data has very low variance. I already ran a PCA, and the first two principal components accounted for just 6% of the variance of the dataset. The picture below shows the histogram and scatter plot of the first two principal components.
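For reference, here is a minimal sketch of how the explained-variance check could be reproduced. The matrix `X` is a synthetic stand-in (an assumption, not the real machine data), and the 90% cutoff is an arbitrary illustrative choice. Note that a small share for the first two components suggests the variance is spread across many directions, which is not quite the same thing as the data having low variance overall.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the machine data: 500 samples, 40 nearly independent features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))

# Standardize first so no single feature dominates because of its scale.
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA with all components and accumulate the explained-variance ratios.
pca = PCA().fit(X_scaled)
cumulative = np.cumsum(pca.explained_variance_ratio_)

first_two = cumulative[1]  # share of variance captured by PC1 + PC2
n_for_90 = int(np.searchsorted(cumulative, 0.90) + 1)  # components needed to reach 90%

print(f"First two components: {first_two:.1%} of variance")
print(f"Components needed for 90% of variance: {n_for_90}")
```

If many components are needed to reach the cutoff, a 2D scatter plot of PC1 vs. PC2 will hide most of the structure, so clustering on only those two components may be misleading.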

For data preparation I tried standardization, min-max scaling, feature selection with a variance threshold (sklearn), univariate feature selection (sklearn) and, as mentioned, PCA. The results aren't satisfactory.
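Roughly, the preparation and evaluation steps look like the sketch below. This is not my exact code: the data is synthetic, and the threshold, number of components and range of `k` are placeholder assumptions. The variance threshold is applied before standardization, since after standardization every remaining feature has unit variance and the filter would no longer remove anything.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.metrics import silhouette_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the machine data: 300 samples, 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
X[:, 0] = 1.0  # a constant column that the variance filter should drop

# Drop zero-variance features, standardize, then reduce to 5 components.
prep = make_pipeline(
    VarianceThreshold(threshold=0.0),
    StandardScaler(),
    PCA(n_components=5),
)
X_prep = prep.fit_transform(X)

# Score K-Means++ for several cluster counts with the silhouette coefficient.
scores = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    labels = km.fit_predict(X_prep)
    scores[k] = silhouette_score(X_prep, labels)

for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
```

On structureless data like this synthetic example, the silhouette values stay low for every `k`, which is similar to what I see on the real data.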

So my question is whether there are any other data preparation methods that you think could be helpful for my clustering, or whether the data is simply not suitable for clustering :D

Thanks for every comment!

First two principal components

K-Means++ with two features

K-Means++ with five features
