How to select optimal number of components for NMF in python sklearn?


There is no built-in function in Python's sklearn to do this.

In my research I found out that a "precision score" err(components) can be calculated via

[image of the formula for err(c)]

The optimal number of components will have the minimum err(c).

Given the below test code, how can the precision score be implemented in python?

import numpy as np
import pandas as pd
from sklearn.decomposition import NMF
X = np.random.rand(40, 100) # create matrix for NMF
c = 4
model = NMF(n_components=c, init='random', random_state=0)
W = model.fit_transform(X)
H = model.components_

There are 2 best solutions below


I'm not sure about the transposition in your formula, since sklearn's H (model.components_) already has shape (n_components, n_features), but this should do the trick:

err = np.linalg.norm(X - W @ H)**2/np.linalg.norm(X)**2
print(err)
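
To compare several values of c, the same expression can be wrapped in a loop. This is only a minimal sketch: the helper name `relative_error`, the candidate range 1 to 8, and `max_iter=500` are my own assumptions for illustration, not something from the question.

```python
import numpy as np
from sklearn.decomposition import NMF

# Same kind of test matrix as in the question (seeded for repeatability).
rng = np.random.default_rng(0)
X = rng.random((40, 100))

def relative_error(X, n_components):
    """Squared Frobenius error of the NMF fit, relative to ||X||^2 (hypothetical helper)."""
    model = NMF(n_components=n_components, init="random",
                random_state=0, max_iter=500)  # max_iter is an assumed setting
    W = model.fit_transform(X)
    H = model.components_
    return np.linalg.norm(X - W @ H) ** 2 / np.linalg.norm(X) ** 2

# Candidate component counts to try (assumed range).
candidates = range(1, 9)
errors = {c: relative_error(X, c) for c in candidates}

for c, e in errors.items():
    print(c, round(e, 4))
```

In practice this error tends to keep shrinking as c grows, so it may be worth looking at where the curve flattens out rather than only at the strict minimum.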

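Simply use the built-in attribute reconstruction_err_ of the fitted model. Note that with the default Frobenius loss it is the plain norm ||X - WH||, neither squared nor divided by ||X||, so it is not the same number as the ratio in the question: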

err = model.reconstruction_err_
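
A quick check of how this attribute relates to the manual calculation, as a sketch assuming the default `beta_loss='frobenius'` (the variable names `frob` and `relative_err`, and `max_iter=500`, are mine):

```python
import numpy as np
from sklearn.decomposition import NMF

X = np.random.default_rng(0).random((40, 100))
model = NMF(n_components=4, init="random", random_state=0, max_iter=500)
W = model.fit_transform(X)
H = model.components_

# Frobenius norm of the residual, computed by hand.
frob = np.linalg.norm(X - W @ H)

# Should agree with the attribute stored on the fitted model.
print(np.isclose(frob, model.reconstruction_err_))

# Squaring and normalizing gives the relative error asked about in the question.
relative_err = model.reconstruction_err_ ** 2 / np.linalg.norm(X) ** 2
print(relative_err)
```

If you switch to another beta_loss, the attribute holds the corresponding beta-divergence instead, so the conversion above would no longer apply.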