confusionMatrix for a classifier in R

I am using the confusionMatrix function from the caret package in R to evaluate the performance of several methods (elastic net from the glmnet package, Gaussian processes from kernlab, and random forest) on two-class data.

For some of the methods, I sometimes get the following warning:

Warning message: In confusionMatrix.default(pred, truth) : Levels are not in the same order for reference and data. Refactoring data to match.

and the performance is, e.g., 65 percent. However, if I relabel the levels of the predictions (pred in the example above), reversing their order to match "truth", the performance becomes 25%.

I constructed the following toy data.

library(caret)

pred = c("a", "a", "a", "b")
pred = as.factor(pred)
levels(pred) = rev(levels(pred)) # depending on whether this line is run, I get either 25% or 75%

truth = c("a", "a", "b", "b")
truth = as.factor(truth)

confusionMatrix(pred, truth)

I understand this is to be expected, since the data has only two classes. However, I wonder whether I can use it in my favour: if the performance is 25%, can I simply report it as 75%?
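One thing worth checking in the toy data: in base R, assigning to levels() renames the labels, so every prediction is flipped, whereas rebuilding the factor with factor(x, levels = ...) only changes the level order and leaves the predictions alone. A minimal base-R sketch of that difference (the variable names relabeled and reordered are made up for illustration, and accuracy is computed by hand rather than through caret):

```r
truth <- factor(c("a", "a", "b", "b"))
pred  <- factor(c("a", "a", "a", "b"))

# Relabeling: levels<- renames the labels, so each "a" becomes "b" and vice versa
relabeled <- pred
levels(relabeled) <- rev(levels(relabeled))

# Reordering: same values, only the order of the levels changes
reordered <- factor(pred, levels = rev(levels(pred)))

# Compare as character so the level order cannot interfere
acc_original  <- mean(as.character(pred)      == as.character(truth))
acc_relabeled <- mean(as.character(relabeled) == as.character(truth))
acc_reordered <- mean(as.character(reordered) == as.character(truth))

print(c(original = acc_original, relabeled = acc_relabeled, reordered = acc_reordered))
```

If the goal is only to silence the warning, aligning the level order with something like factor(pred, levels = levels(truth)) should do so without changing any individual prediction.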

There is 1 best solution below

See ?caret::confusionMatrix, specifically the parameter positive:

positive an optional character string for the factor level that corresponds to a "positive" result (if that makes sense for your data). If there are only two factor levels, the first level will be used as the "positive" result.
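To illustrate what the choice of positive class affects, here is a rough sketch on the question's toy data. The helper sensitivity_for is hypothetical, written here in base R only to show the idea; it is not part of caret. Accuracy does not depend on which class is called positive, while per-class figures such as sensitivity do. The caret call at the end is guarded so the sketch still runs if caret is not installed.

```r
truth <- factor(c("a", "a", "b", "b"))
pred  <- factor(c("a", "a", "a", "b"), levels = levels(truth))

# Hypothetical helper: sensitivity = TP / (TP + FN) for the chosen positive class
sensitivity_for <- function(pred, truth, positive) {
  sum(pred == positive & truth == positive) / sum(truth == positive)
}

accuracy <- mean(pred == truth)                  # independent of the positive class
sens_a   <- sensitivity_for(pred, truth, "a")    # treating "a" as positive
sens_b   <- sensitivity_for(pred, truth, "b")    # treating "b" as positive

print(c(accuracy = accuracy, sens_a = sens_a, sens_b = sens_b))

# With caret available, the positive class can be set explicitly
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(pred, truth, positive = "b"))
}
```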

On a second note, unless your classes are roughly 50-50, you should probably evaluate your results with something other than a confusion matrix.
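As a rough illustration of why class balance matters, consider made-up data with a 95/5 split and a classifier that always predicts the majority class. The data and the choice of balanced accuracy as the alternative measure are assumptions for this sketch, not something taken from the question:

```r
# Made-up imbalanced data: 95 cases of "a", 5 cases of "b"
truth <- factor(c(rep("a", 95), rep("b", 5)), levels = c("a", "b"))

# A trivial classifier that always predicts the majority class
pred <- factor(rep("a", 100), levels = c("a", "b"))

accuracy <- mean(pred == truth)

# Per-class recall: fraction of each true class that was predicted correctly
recall_a <- sum(pred == "a" & truth == "a") / sum(truth == "a")
recall_b <- sum(pred == "b" & truth == "b") / sum(truth == "b")

# Balanced accuracy: mean of the per-class recalls
balanced_accuracy <- (recall_a + recall_b) / 2

print(c(accuracy = accuracy, balanced_accuracy = balanced_accuracy))
```

Here the plain accuracy looks high even though the minority class is never detected, which the balanced figure makes visible.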