SVD with missing values in R

I am performing an SVD analysis in R, but my matrix contains structural NA values. Is it possible to obtain an SVD in this case? Are there alternative solutions? Thanks in advance.

There are 2 best solutions below

You might want to try the SVDmiss function in the SpatioTemporal package, which imputes the missing values and then computes the SVD of the imputed matrix. See the SVDmiss function documentation.

However, consider the nature of your data and whether imputing the missing values makes sense in your case, especially if the NAs are structural rather than randomly missing.
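
To give an idea of what SVD-based imputation can look like, here is a minimal base R sketch of one common approach: fill the NAs with column means, compute a low-rank SVD, overwrite only the missing cells with the low-rank reconstruction, and repeat until the filled values stop changing. This is an illustration, not the SpatioTemporal implementation. The function name svd_impute and its arguments (rank, max_iter, tol) are made up for this example, and it assumes every column has at least one observed value.

```r
# Sketch of iterative low-rank SVD imputation (base R only).
# svd_impute and its arguments are hypothetical names, not the SVDmiss API.
svd_impute <- function(X, rank = 2, max_iter = 100, tol = 1e-6) {
  miss <- is.na(X)
  # Initial guess: replace each missing entry with its column mean.
  col_means <- colMeans(X, na.rm = TRUE)
  X_fill <- X
  X_fill[miss] <- col_means[col(X)[miss]]

  for (iter in seq_len(max_iter)) {
    s <- svd(X_fill, nu = rank, nv = rank)
    # Low-rank reconstruction of the currently filled matrix.
    X_hat <- s$u %*% diag(s$d[seq_len(rank)], nrow = rank) %*% t(s$v)
    # Relative change of the imputed entries between iterations.
    change <- sqrt(sum((X_hat[miss] - X_fill[miss])^2)) /
      max(sqrt(sum(X_fill[miss]^2)), .Machine$double.eps)
    # Only missing cells are updated; observed values are never touched.
    X_fill[miss] <- X_hat[miss]
    if (change < tol) break
  }
  list(Xfill = X_fill, svd = svd(X_fill), iterations = iter)
}

# Usage: a rank-1 matrix with two entries removed.
X_true <- outer(1:6, c(1, 2, 4, 8))
X <- X_true
X[2, 3] <- NA
X[5, 1] <- NA
res <- svd_impute(X, rank = 1)
res$Xfill   # completed matrix
res$svd$d   # singular values of the completed matrix
```

The choice of rank matters: the imputed values are only as good as the assumption that the data are well described by that many components, which is exactly the point about checking whether imputation makes sense for your data.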

I have tried using SVM in R with NA values, without success. Missing values are sometimes informative in an analysis, so I usually transform my data as follows:

  1. If you have lots of variables, try to reduce their number (clustering, LASSO, etc.)
  2. Transform the remaining predictors like this:

    - for quantitative variables:

       - calculate the deciles of each predictor (leaving missing obs out)
       - calculate the frequency of Y per decile (assuming Y is qualitative)
       - regroup the deciles into 2/3/4 groups by similarity of their Y freq
         (you can also do this by looking at a plot)
       - create a new binary variable for each group
         (X11 = 1 if X1 takes values in the interval ...)
       - calculate the Y frequency for the missing obs of that predictor
       - join the missing obs to the group (binary variable) with the closest Y freq
    

    - for qualitative variables:

       - for variables with lots of levels, cluster the levels by the Y variable
       - for variables with fewer levels, calculate the Y freq per class
       - regroup the classes as above
       - calculate the Y freq for the missing obs and attach them to the most
         similar non-missing group
       - recode the variable as in the quantitative case
    

You now have a complete data set of dummy variables with no missing values, on which you can run SVM, neural networks, LASSO, etc.
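
The steps for a quantitative variable can be sketched in base R roughly as follows. This is only one way to do it, under some assumptions that are mine: Y is binary and coded 0/1, the deciles are grouped with hierarchical clustering on their Y frequencies (so a group may combine non-adjacent deciles), and the names recode_quantitative and n_groups are invented for the example.

```r
# Sketch of the quantitative-variable recoding described above.
# recode_quantitative and n_groups are hypothetical names; y is assumed 0/1.
recode_quantitative <- function(x, y, n_groups = 3) {
  miss <- is.na(x)
  # Deciles computed on the observed values only.
  breaks <- unique(quantile(x[!miss], probs = seq(0, 1, by = 0.1)))
  decile <- cut(x, breaks = breaks, include.lowest = TRUE)  # NA stays NA

  # Frequency of Y == 1 per decile; empty deciles (NaN) are dropped.
  freq <- sapply(split(y[!miss], decile[!miss]), mean)
  freq <- freq[!is.nan(freq)]

  # Group deciles with similar Y frequency (needs at least 2 deciles).
  k <- min(n_groups, length(freq))
  grp <- cutree(hclust(dist(freq)), k = k)
  names(grp) <- names(freq)

  # Map every observation to the group of its decile (NA if x is missing).
  group <- unname(grp[as.character(decile)])

  # Attach missing obs to the group with the closest Y frequency.
  if (any(miss)) {
    grp_freq <- tapply(y[!miss], group[!miss], mean)
    miss_freq <- mean(y[miss])
    closest <- as.integer(names(grp_freq)[which.min(abs(grp_freq - miss_freq))])
    group[miss] <- closest
  }

  # One 0/1 dummy variable per group.
  lev <- sort(unique(group))
  dummies <- 1L * outer(group, lev, "==")
  colnames(dummies) <- paste0("group", lev)
  list(dummies = dummies, group = group, decile_groups = grp)
}

# Usage on simulated data with 10% of x set to NA.
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(2 * x))
x[sample(n, 20)] <- NA
out <- recode_quantitative(x, y, n_groups = 3)
head(out$dummies)
out$decile_groups
```

The qualitative case follows the same pattern with the factor levels taking the place of the deciles.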