I am performing unsupervised clustering on PCA scores. The first 7 components explain 50%, 13%, 9%, 8%, 5%, 3%, and 3% of the variance, respectively.
No single feature stands out in the PC1 loadings. However, the remaining PCs do have some standout features in their loadings.
When I compare my results to the ground truth, the clustering is poor. If I exclude PC1, my results improve a bit.
Why does my clustering algorithm discriminate better when I exclude the PC1 scores from the input data? And is this okay to do, i.e., leaving out 50% of the variance of the original data?
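To make the comparison concrete, it could be sketched roughly as below. This is only an illustration under assumptions: scikit-learn is used, the features are standardized before PCA, k-means stands in for the clustering algorithm, and the adjusted Rand index stands in for the comparison against the ground truth. The function name `compare_with_and_without_pc1` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def compare_with_and_without_pc1(X, y_true, n_clusters, n_components=7, seed=0):
    """Cluster PCA scores with and without PC1 and score both against y_true.

    Hypothetical helper: k-means and the adjusted Rand index are assumed
    stand-ins for the clustering algorithm and the ground-truth comparison.
    """
    X = np.asarray(X, dtype=float)

    # Standardize so that no feature dominates purely because of its scale.
    X_std = StandardScaler().fit_transform(X)

    # Project onto the leading principal components.
    pca = PCA(n_components=n_components).fit(X_std)
    scores = pca.transform(X_std)

    # Column selections: all retained PCs vs. all retained PCs except PC1.
    subsets = {"with_pc1": slice(0, None), "without_pc1": slice(1, None)}

    results = {}
    for name, cols in subsets.items():
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labels = km.fit_predict(scores[:, cols])
        # Agreement with the ground truth, invariant to label permutation.
        results[name] = adjusted_rand_score(y_true, labels)

    return pca.explained_variance_ratio_, results
```

Calling `compare_with_and_without_pc1(X, y_true, n_clusters=k)` returns the explained variance ratios and one agreement score per input variant, so the two runs can be compared side by side.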
Thanks
Clustering with PCA with and without PC1 included in the input data.