I have a subset of the MNIST handwritten digits dataset. I'm trying to reduce its dimensionality with PCA, Kernel PCA, LLE, and t-SNE, plotting each result with Plotly Express's scatter_3d. But as a beginner, I don't know how to interpret the figures. Please guide me.
from sklearn.decomposition import PCA
import plotly.express as px

pca = PCA(n_components=3)  # project down to 3 dimensions
X_pca = pca.fit_transform(X_train)
X_pca_r = pca.inverse_transform(X_pca)  # reconstruction back to the original pixel space

fig = px.scatter_3d(x=X_pca[:, 0], y=X_pca[:, 1], z=X_pca[:, 2], color=y_train)
fig.show()
This gives the following figure:
Then, using KernelPCA:
from sklearn.decomposition import KernelPCA

kpca = KernelPCA(n_components=3, fit_inverse_transform=True)  # default kernel is 'linear'
X_kpca = kpca.fit_transform(X_train)
X_kpca_r = kpca.inverse_transform(X_kpca)

px.scatter_3d(x=X_kpca[:, 0], y=X_kpca[:, 1], z=X_kpca[:, 2], color=y_train).show()
results in this figure:
Similarly, using LocallyLinearEmbedding:
from sklearn.manifold import LocallyLinearEmbedding

lle = LocallyLinearEmbedding(n_components=3)
X_lle = lle.fit_transform(X_train)

px.scatter_3d(x=X_lle[:, 0], y=X_lle[:, 1], z=X_lle[:, 2], color=y_train).show()
results in the following figure:
Lastly, using TSNE:
from sklearn.manifold import TSNE

tsne = TSNE(n_components=3)
X_tsne = tsne.fit_transform(X_train)

px.scatter_3d(x=X_tsne[:, 0], y=X_tsne[:, 1], z=X_tsne[:, 2], color=y_train).show()
results in the following figure:
Please feel free to comment if I have misunderstood your question; if you point out the specific part that is troubling you, I will gladly condense the answer.
In my experience, three dimensions will not be enough to classify handwritten digits very well, in the same way that a 3-pixel display cannot represent digits in a way that resembles how they look when written by hand. This is why the figures might not intuitively make sense, although points of the same colour, corresponding to the same digit, are somewhat grouped in the plots (for example, the yellow spheres, which are the digit 9).
In other datasets, where 3 features are enough to classify the data, you might see the data form distinct clusters. The larger the distance between clusters (intercluster distance), and the smaller the distance between points within the same cluster (intracluster distance), the better. A much-used example is the Iris flower dataset:
Data: https://www.kaggle.com/datasets/arshid/iris-flower-dataset
Example, with visualisation: https://www.kaggle.com/code/imdevskp/plotly-express-3d-scatter-plot-iris-data/notebook
This page shows the concepts of cluster distances quite well: https://www.geeksforgeeks.org/ml-intercluster-and-intracluster-distance/
The figures are 2-dimensional, but the basic principles work in higher dimensions.
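As a rough sketch of what those two quantities mean, the helpers below compute them for a labelled embedding such as X_pca with labels y_train. The function names are my own, and I use centroid distance as one of several possible definitions of intercluster distance:

import numpy as np
from scipy.spatial.distance import cdist

def mean_intracluster_distance(X, labels):
    # average pairwise distance between points that share a label
    per_cluster = []
    for c in np.unique(labels):
        pts = X[labels == c]
        if len(pts) > 1:
            d = cdist(pts, pts)
            per_cluster.append(d[np.triu_indices_from(d, k=1)].mean())
    return np.mean(per_cluster)

def mean_intercluster_distance(X, labels):
    # average distance between cluster centroids
    centroids = np.array([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    d = cdist(centroids, centroids)
    return d[np.triu_indices_from(d, k=1)].mean()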
I would recommend that you look into numerical indicators rather than figures, as most problems work best with more than 3 dimensions, which cannot be shown in a figure.
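One such indicator is scikit-learn's silhouette score, which combines intercluster and intracluster distances into a single number between -1 and 1 (higher means better-separated classes). A minimal sketch, assuming the embeddings and y_train from your code above:

from sklearn.metrics import silhouette_score

# compare how well each 3-D embedding separates the digit classes
for name, X_emb in [("PCA", X_pca), ("Kernel PCA", X_kpca),
                    ("LLE", X_lle), ("t-SNE", X_tsne)]:
    print(f"{name}: silhouette = {silhouette_score(X_emb, y_train):.3f}")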
Along the same lines, you should also look into how the packages can report the significance of each principal component/dimension, to better determine how many features to include in the analysis.
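For PCA specifically, scikit-learn exposes this through the explained_variance_ratio_ attribute. A quick sketch using your X_train that finds how many components are needed to keep 95% of the variance (the threshold is just an example):

import numpy as np
from sklearn.decomposition import PCA

# fit with all components to see how much variance each one carries
pca_full = PCA().fit(X_train)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# smallest number of components that retains 95% of the variance
n_components_95 = int(np.argmax(cumulative >= 0.95)) + 1
print(f"Components needed for 95% variance: {n_components_95}")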
Lastly, I would recommend adjusting the size of the spheres in your graphs so that they do not overlap each other as much, although this is difficult with a large number of data points.
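In Plotly Express you can shrink the markers after building the figure; the values below are just starting points to tune:

fig = px.scatter_3d(x=X_pca[:, 0], y=X_pca[:, 1], z=X_pca[:, 2], color=y_train)
fig.update_traces(marker_size=2, marker_opacity=0.7)  # smaller, semi-transparent points
fig.show()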