caret::train Duplicating predictor variables


In R, when I try to build an rpart classification tree with caret using:

tree <- caret::train(LoanStatus ~ ., data = home_training, method = "rpart")

Everything is fine until I try to predict:

predictions <- predict(tree$finalModel, newdata = home_validation, type = "class")

This gives me the error: Error in eval(predvars, data, env): object 'Gender1' not found

Then, I notice that R has duplicated some of my predictor variables (they are factors):

varImp(tree) outputs:

ApplicantIncome    2.218022
CoapplicantIncome    4.564288
Education1    6.214741
LoanAmount    7.183707
LoanAmountTerm    1.554240
Married1    6.895146
PropertyArea1    5.806154
Gender1    0.000000
Dependents1    0.000000
Dependents2    0.000000
Dependents3    0.000000
SelfEmployed1    0.000000
PropertyArea2    0.000000

Which contains a lot of duplicates.

If I do the same using rpart directly:

tree2 <- rpart(LoanStatus ~ ., home_training, method = "class")

it does not give me any errors, and there are no duplicate variables.

I want to use caret because it lets me use cross-validation.

How can I fix this?
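For context, a likely cause: the formula interface of caret::train expands factor predictors into dummy columns (Gender becomes Gender1, and so on), so tree$finalModel is an rpart object fitted on those dummy columns and cannot find them in the raw validation data. The sketch below shows two possible workarounds on a small made-up data set. The toy data frame, its column values, and the 5-fold cross-validation setting are assumptions for illustration, not the original home_training data.

```r
library(caret)
library(rpart)

set.seed(42)

# Hypothetical stand-in for home_training / home_validation
n <- 200
toy <- data.frame(
  Gender          = factor(sample(c("0", "1"), n, replace = TRUE)),
  Dependents      = factor(sample(c("0", "1", "2", "3"), n, replace = TRUE)),
  ApplicantIncome = round(runif(n, 1000, 10000))
)
toy$LoanStatus <- factor(
  ifelse(toy$ApplicantIncome + rnorm(n, 0, 1500) > 5000, "Y", "N")
)
toy_training   <- toy[1:150, ]
toy_validation <- toy[151:200, ]

ctrl <- trainControl(method = "cv", number = 5)

# Option 1: keep the formula interface, but predict through the train object
# (not tree$finalModel), so caret applies the same dummy coding to newdata.
tree <- caret::train(LoanStatus ~ ., data = toy_training,
                     method = "rpart", trControl = ctrl)
pred1 <- predict(tree, newdata = toy_validation)

# Option 2: use the x/y interface, which passes factors to rpart unexpanded.
predictors <- setdiff(names(toy_training), "LoanStatus")
tree_xy <- caret::train(x = toy_training[, predictors],
                        y = toy_training$LoanStatus,
                        method = "rpart", trControl = ctrl)
pred2 <- predict(tree_xy, newdata = toy_validation[, predictors])
```

With option 1 the names reported by varImp still carry the dummy suffixes, because the underlying rpart model really was fitted on those columns; option 2 is the closer match to calling rpart directly.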
