How to create an X matrix out of a dataframe in R?

60 Views Asked by TFT At 19 October 2024 at 16:30

I have a dataframe called "prediction_set" which contains y and all possible predictors. From this dataframe, I want to generate the y vector and the X matrix. I've tried the following code, but it only generates an empty matrix (although it displays the column names). How can I solve this?

#store dataframe
prediction_set <- subset(df_clean, is.na(df_clean$lnpercapitaconsumption))

#create X matrix and y vector for prediction set

X_prediction_set <- model.matrix(lnpercapitaconsumption ~ ., prediction_set)
y_prediction_set <- prediction_set$lnpercapitaconsumption

A sample of my dataframe can be found below:

> dput(prediction_set[1:20, c(1, 74)])
structure(list(lnpercapitaconsumption = c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_), h_hhsize = c(1L, 3L, 
4L, 9L, 8L, 3L, 6L, 5L, 4L, 1L, 5L, 1L, 4L, 1L, 2L, 3L, 4L, 6L, 
5L, 4L)), row.names = c(NA, 20L), class = "data.frame")


There is 1 best solution below

January On 11 November 2022 at 09:14

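All values of your response are NA, because you select the subset of df_clean that contains only the rows where lnpercapitaconsumption is NA. With the default na.action (na.omit), model.matrix() drops every row that has a missing value in any model variable, including the response, so no rows remain. If you do the following, for example: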

prediction_set$lnpercapitaconsumption <- rnorm(nrow(prediction_set))

(fill the variable with noise), you will see that the model matrix works as expected. Maybe you meant !is.na() rather than is.na()?
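
To see the effect in isolation, here is a minimal sketch on made-up data. The data frame `toy` and its values are hypothetical stand-ins for df_clean, and the sketch assumes the default setting `options(na.action = "na.omit")`:

```r
# Hypothetical toy data standing in for df_clean
toy <- data.frame(
  lnpercapitaconsumption = c(1.2, NA, 0.7, NA),
  h_hhsize = c(1L, 3L, 4L, 9L)
)

# Same subsetting as in the question: keep only rows where the response is NA
pred <- subset(toy, is.na(lnpercapitaconsumption))

# Under the default na.action (na.omit), rows with an NA in any model
# variable are dropped, so nothing is left but the column names
X_empty <- model.matrix(lnpercapitaconsumption ~ ., pred)
nrow(X_empty)
colnames(X_empty)

# Filling the response with placeholder noise keeps the rows
pred$lnpercapitaconsumption <- rnorm(nrow(pred))
X_filled <- model.matrix(lnpercapitaconsumption ~ ., pred)
nrow(X_filled)
```

The noise is only a placeholder to show that the rows survive; it has no meaning as a response.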

Or maybe you want to make predictions based on a training set model? In that case, you don't need the Y or a model.
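
If that is the goal, one possible approach is sketched below. It assumes the training rows are those of df_clean with an observed response; `toy`, `train`, `predictors`, `fit` and `y_hat` are hypothetical names, and `lm()` is only a placeholder for whatever model you actually use:

```r
# Hypothetical toy data standing in for df_clean
toy <- data.frame(
  lnpercapitaconsumption = c(1.2, NA, 0.7, NA, 2.1, 1.5),
  h_hhsize = c(1L, 3L, 4L, 9L, 8L, 3L)
)

# Training rows have an observed response, prediction rows do not
train <- subset(toy, !is.na(lnpercapitaconsumption))
prediction_set <- subset(toy, is.na(lnpercapitaconsumption))

# Drop the all-NA response column first, then build X from a
# one-sided formula so no row is removed because of the missing y
predictors <- prediction_set[, setdiff(names(prediction_set),
                                       "lnpercapitaconsumption"),
                             drop = FALSE]
X_prediction_set <- model.matrix(~ ., data = predictors)

# Fit on the training rows and predict for the prediction rows;
# predict() ignores the response column in newdata
fit <- lm(lnpercapitaconsumption ~ ., data = train)
y_hat <- predict(fit, newdata = prediction_set)
```

Removing the column from the data frame, rather than writing `~ . - lnpercapitaconsumption`, keeps the response out of the model frame entirely, so its NAs cannot cause rows to be dropped.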
