How to evaluate the information retained in UMAP?

2.5k Views Asked by ghost At 13 December 2020 at 05:36

I tried to find an attribute similar to explained_variance_ratio_ (from PCA in sklearn) for UMAP, but I am unable to find one. With PCA, I could compute explained_variance_ratio_ for different values of n_components and compare the results. Is there anything similar that I can use for UMAP in Python?


There are 1 best solutions below

StupidWolf On 13 December 2020 at 08:09 BEST ANSWER

You cannot easily estimate the variance explained by UMAP because, unlike PCA, it is a form of nonlinear dimension reduction. Below is a more detailed dive.

PCA tries to find projections in the high-dimensional space that capture as much variance as possible. You project the data onto these orthogonal directions, and you can estimate the variance captured by each, compared to the variance in the original data. It is a linear operation throughout, so the variance explained is well defined. You can check out this post about variance explained or this about PCA.
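To make the PCA side concrete, here is a minimal sketch of how the variance explained falls out of the linear decomposition. It uses only NumPy and toy random data; the function name and the data are made up for illustration. For a full decomposition, the result should correspond to what sklearn's PCA reports as explained_variance_ratio_.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of the total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    # Singular values of the centred data give the spread along each component.
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s**2 / (X.shape[0] - 1)            # variance along each component
    return var / var.sum()                   # normalise by the total variance

# Toy data (illustrative only): five features with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)  # variance retained when keeping the first k components
print(ratio)
print(cumulative)
```

Because the components are orthogonal, the ratios add up to 1, which is what lets you compare different values of n_components by looking at the cumulative sum.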

UMAP is a form of nonlinear dimension reduction. From the help page, UMAP uses so-called simplicial complexes to capture the topological structure of your features, and from there obtains a low-dimensional representation. You can think of it as a high-dimensional graph that is more geared towards capturing the interconnectedness between data points than the variance. Hence, as of now, I am not aware of a way to retrieve the variance explained in a UMAP. You can also check out the author's reply on GitHub.
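Since UMAP is geared towards interconnectedness rather than variance, one possible proxy (not an equivalent of variance explained, and not something the UMAP documentation prescribes) is to check how many of each point's nearest neighbours survive in the embedding. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it uses plain NumPy with brute-force distances, and the "embeddings" are stand-ins. In practice you would pass the output of umap.UMAP(n_components=k).fit_transform(X) as X_low.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row (brute force, Euclidean)."""
    sq = np.sum(X**2, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared pairwise distances
    np.fill_diagonal(d, np.inf)                       # a point is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def neighbor_preservation(X_high, X_low, k=10):
    """Mean fraction of each point's k nearest neighbours shared by both spaces."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap)) / k

# Stand-in data and embeddings (illustrative only, no UMAP involved here).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))

score_same = neighbor_preservation(X, X)                              # identical space
score_random = neighbor_preservation(X, rng.normal(size=(300, 2)))    # unrelated embedding
print(score_same, score_random)
```

A score near 1 means local neighbourhoods are largely kept, and a score near k / (n - 1) is what an unrelated embedding would give. You could compute it for several values of n_components, in the same spirit as comparing PCA results. sklearn also ships a related metric, sklearn.manifold.trustworthiness, which may be a better choice than this hand-rolled version.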
