R Imputation With MICE

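set.seed(1)
library(data.table)
data <- data.table(STUDENT = 1:1000,
                   # draw 1000 values with replacement so the column length matches
                   OUTCOME = sample(20:90, 1000, replace = TRUE),
                   X1 = runif(1000),
                   X2 = runif(1000),
                   X3 = runif(1000))
# set some values to missing; each column keeps its own observed values
data[, X1 := fifelse(X1 > .9, NA_real_, X1)]
data[, X2 := fifelse(X2 > .78 & X2 < .9, NA_real_, X2)]
data[, X3 := fifelse(X3 < .1, NA_real_, X3)]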

Say you have the data shown above and want to impute the missing values in X1, X2, and X3 while leaving STUDENT and OUTCOME out of the imputation model.

I can run

library(mice)
dataIMPUTE=mice(data[, c("X1", "X2", "X3")], m = 1)

but how do I combine the imputed values from dataIMPUTE with STUDENT and OUTCOME? I am afraid of merging the rows incorrectly, so I would appreciate advice on how to do this safely.

There is 1 best solution below

One possibility is to pass the full data set to mice, but change the predictorMatrix so that STUDENT and OUTCOME are not used as predictors in the imputation model.

First, run mice with maxit = 0 to obtain the default predictorMatrix without performing any imputation. In this matrix each row is a variable to be imputed and each column is a potential predictor, so setting a column to 0 excludes that variable from every imputation model. Because the whole data set is passed to mice, all variables, including STUDENT and OUTCOME, are still contained in the dataIMPUTE object:

set.seed(1)
library(data.table)
data=data.table(STUDENT = 1:1000,
                OUTCOME = sample(20:90, 1000, replace = TRUE),
                X1 = runif(1000),
                X2 = runif(1000),
                X3 = runif(1000))
index_1 <- sample(1:1000, 100)
index_2 <- sample(1:1000, 100)
index_3 <- sample(1:1000, 100)
data[index_1, X1 := NA_real_]
data[index_2, X2 := NA_real_]
data[index_3, X3 := NA_real_]

library(mice)
init <- mice(data, maxit = 0, printFlag = FALSE)

# extract the predictor matrix
pred_mat <- init$predictorMatrix

# remove STUDENT and OUTCOME as predictors
pred_mat[, c("STUDENT", "OUTCOME")] <- 0

# do the imputation
dataIMPUTE <- mice(data, predictorMatrix = pred_mat, m = 1)
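
To get from dataIMPUTE back to one table, so that no merge is needed, the completed data can be extracted with complete(). The following is a minimal sketch, not the only way to do it. It assumes a smaller data set of 200 rows and a plain data.frame instead of a data.table to keep the example light. The name dataCOMPLETE is introduced here for illustration only.

```r
library(mice)

set.seed(1)
n <- 200  # smaller than the original 1000 rows, only to keep the sketch fast

data <- data.frame(STUDENT = 1:n,
                   OUTCOME = sample(20:90, n, replace = TRUE),
                   X1 = runif(n),
                   X2 = runif(n),
                   X3 = runif(n))

# set 20 randomly chosen values per column to missing
data$X1[sample(n, 20)] <- NA
data$X2[sample(n, 20)] <- NA
data$X3[sample(n, 20)] <- NA

# dry run to get the default predictor matrix, then drop the two columns
init <- mice(data, maxit = 0, printFlag = FALSE)
pred_mat <- init$predictorMatrix
pred_mat[, c("STUDENT", "OUTCOME")] <- 0

dataIMPUTE <- mice(data, predictorMatrix = pred_mat, m = 1, printFlag = FALSE)

# complete() returns the data with the missing values filled in from
# imputation number 1; rows and columns stay in their original order
dataCOMPLETE <- complete(dataIMPUTE, 1)
head(dataCOMPLETE)
```

Because STUDENT and OUTCOME were part of the data passed to mice, they come back in dataCOMPLETE next to the imputed X1, X2, and X3, in the same row order as the input.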