Ranger predict incorrect number of dimensions in R

I'm running into two problems when evaluating a ranger model. In both cases I can't subset the predictions (I want the first column of rf.trnprob).

rangerModel= ranger(outcome~., data=traindata, num.trees=200, probability=TRUE)
rf.trnprob= predict(rangerModel, traindata, type='prob')


trainscore <- subset(traindata, select=c("outcome"))
trainscore$score<-rf.trnprob[, 1]  

Error:

incorrect number of dimensions

table(pred = rf.trnprob, true=traindata$outcome)

Error:

all arguments must have the same length

There are 1 best solutions below

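The predict() call is the problem. As far as I can tell, ranger's predict() has no 'prob' type: the accepted values for type are 'response', 'se', 'terminalNodes' and 'quantiles'. Because the model was fitted with probability=TRUE, the default type='response' already returns class probabilities, so the argument can simply be left out. Note also that predict() returns a prediction object (a list), not a matrix, which is why rf.trnprob[, 1] fails with "incorrect number of dimensions". Using an example dataset:

library(ranger)
traindata =iris
traindata$Species = factor(as.numeric(traindata$Species=="versicolor"))
rangerModel = ranger(Species~.,data=traindata,probability=TRUE)
rf.trnprob= predict(rangerModel, data=traindata)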

The probabilities are stored in the $predictions element, one column per class:

head(rf.trnprob$predictions)
             0           1
[1,] 1.0000000 0.000000000
[2,] 0.9971786 0.002821429
[3,] 1.0000000 0.000000000
[4,] 1.0000000 0.000000000
[5,] 1.0000000 0.000000000
[6,] 1.0000000 0.000000000
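
To get back to what the question was trying to do, here is a minimal sketch of attaching a probability column to the score data frame. It reuses the iris stand-in from above, so the outcome column is Species rather than outcome. The set.seed() call and the score_pos column name are my own additions for illustration.

```r
library(ranger)

set.seed(42)  # assumption: only here to make the sketch repeatable
traindata <- iris
traindata$Species <- factor(as.numeric(traindata$Species == "versicolor"))
rangerModel <- ranger(Species ~ ., data = traindata, probability = TRUE)
rf.trnprob <- predict(rangerModel, data = traindata)

# predict() returns a list-like object, so index the $predictions matrix
probs <- rf.trnprob$predictions

# Keep the true label and attach the probability of the first class ("0")
trainscore <- subset(traindata, select = c("Species"))
trainscore$score <- probs[, 1]

# Selecting by class name is less fragile than relying on column order
trainscore$score_pos <- probs[, "1"]
head(trainscore)
```

With two classes the two score columns sum to 1 in every row, so in practice you would keep whichever class you treat as the positive one.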

It also looks like you want a confusion matrix. For that, turn the probabilities into class labels by picking the most probable class in each row:

pred = levels(traindata$Species)[max.col(rf.trnprob$predictions)]

Then:

table(pred,traindata$Species)
pred   0   1
   0 100   2
   1   0  48
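
One detail worth knowing about the max.col() step: by default it breaks ties at random, so a row with equal probabilities can be labelled differently from run to run. The sketch below uses a small made-up probability matrix (a hypothetical stand-in for rf.trnprob$predictions, not real model output) to show a deterministic variant. It also wraps the labels in factor() with fixed levels so the table stays square even if one class is never predicted.

```r
# Hypothetical probability matrix standing in for rf.trnprob$predictions
probs <- matrix(c(0.9, 0.1,
                  0.5, 0.5,
                  0.2, 0.8),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("0", "1")))

# Hypothetical true labels for the three rows
truth <- factor(c("0", "1", "1"), levels = c("0", "1"))

# ties.method = "first" resolves the tied second row to the first column
pred <- factor(colnames(probs)[max.col(probs, ties.method = "first")],
               levels = levels(truth))

# Confusion matrix with predicted classes in rows and true classes in columns
table(pred = pred, true = truth)
```

Whether "first" is the right tie rule depends on which class you would rather favour. An explicit threshold on one column, for example probs[, "1"] > 0.5, is another common choice.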