I'm looking for a fast NMF implementation for sparse matrices in R.
The R NMF package provides a number of algorithms, none of which are impressive in terms of computation time.
NNLM::nnmf() seems to be the state of the art in R at the moment, specifically with method = "scd" and loss = "mse", which is implemented as alternating least squares solved by sequential coordinate descent. However, this method is quite slow on very large, very sparse matrices.
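To make the approach concrete, here is a minimal sketch of alternating least squares where each nonnegative least-squares subproblem is solved by a few sweeps of coordinate descent. It is written in base R plus the Matrix package and is only an illustration of the technique, not the NNLM code; the function names scd_update and als_nmf, the fixed sweep count, and the random test matrix are all assumptions made for the example.

```r
library(Matrix)  # ships with R; provides sparse dgCMatrix storage

# Coordinate-descent sweeps for min ||A - W H||_F^2 subject to H >= 0,
# given G = W'W (k x k) and B = W'A (k x n).
# Columns of H are independent, so row i is updated for all columns at once.
scd_update <- function(G, B, H, sweeps = 5) {
  for (s in seq_len(sweeps)) {
    for (i in seq_len(nrow(H))) {
      if (G[i, i] <= 0) next  # skip a dead factor to avoid dividing by zero
      # negative half-gradient of the loss with respect to row i of H
      grad <- B[i, ] - drop(G[i, , drop = FALSE] %*% H)
      H[i, ] <- pmax(0, H[i, ] + grad / G[i, i])
    }
  }
  H
}

# Hypothetical driver: alternate between updating H (W fixed) and W (H fixed).
als_nmf <- function(A, k, maxit = 50, seed = 1) {
  set.seed(seed)
  At <- t(A)  # transpose the sparse matrix once, outside the loop
  W <- matrix(runif(nrow(A) * k), nrow(A), k)
  H <- matrix(runif(k * ncol(A)), k, ncol(A))
  for (it in seq_len(maxit)) {
    # H update: G = W'W, B = W'A (sparse-dense product, then densified)
    H <- scd_update(crossprod(W), t(as.matrix(At %*% W)), H)
    # W update: same solver applied to the transposed problem A' ~ H'W'
    W <- t(scd_update(tcrossprod(H), t(as.matrix(A %*% t(H))), t(W)))
  }
  list(w = W, h = H)
}

# Small random sparse test matrix with nonnegative entries (about 95% zeros)
set.seed(42)
A <- rsparsematrix(200, 100, density = 0.05, rand.x = runif)

rel_err <- function(fit) {
  norm(as.matrix(A) - fit$w %*% fit$h, "F") / norm(as.matrix(A), "F")
}
fit1 <- als_nmf(A, k = 5, maxit = 1)
fit  <- als_nmf(A, k = 5, maxit = 50)
c(after_1_iteration = rel_err(fit1), after_50_iterations = rel_err(fit))
```

Only the products with A touch the sparse matrix; everything else is dense k x k or k x n arithmetic. The interpreted inner loop is the bottleneck here, which is why fast implementations move it into compiled code.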
The rsparse::WRMF function is extremely fast, but that is because only the positive values in A are used for the row-wise computation of W and H.
Is there any reasonable implementation for solving NMF on a sparse matrix?
Is there an equivalent to scikit-learn in R? See this question.
There are various worker functions in R, such as fnnls and tsnnls, none of which surpass nnls::nnls (written in Fortran). I have been unable to build any of these functions into a faster NMF framework.
Forgot I even posted this question, but one year later...
I wrote a very fast implementation of NMF in RcppEigen; see the RcppML R package on CRAN. It's at least an order of magnitude faster than NNLM::nnmf, and for comparison, RcppML::nmf rivals the runtime of irlba::irlba SVD (although it's an altogether different algorithm).
I've successfully applied my implementation to 1.3 million single cells containing 26,000 genes in a 96% sparse matrix for a rank-100 factorization in 1 minute. I think that's very reasonable.
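For anyone who wants to try it, a minimal usage sketch follows. The random 96% sparse matrix is a stand-in for real data, and the arguments shown (k, tol, maxit, seed, verbose) and the w and h components of the result are as I understand the CRAN release; treat them as assumptions and check ?RcppML::nmf for your installed version.

```r
library(Matrix)

# Stand-in data: 1000 x 500 nonnegative matrix with 4% nonzero entries
set.seed(1)
A <- rsparsematrix(1000, 500, density = 0.04, rand.x = runif)

# Run only if the package is installed; argument names are assumptions
# based on the CRAN release and may differ in other versions.
if (requireNamespace("RcppML", quietly = TRUE)) {
  model <- RcppML::nmf(A, k = 10, tol = 1e-4, maxit = 100,
                       seed = 1, verbose = FALSE)
  print(dim(model$w))  # rows of A by rank
  print(dim(model$h))  # rank by columns of A
}
```

Keeping A as a dgCMatrix matters: converting to a dense matrix first would discard the sparsity that the speed depends on.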