R Confusion Matrix - Error: `data` and `reference` should be factors with the same levels


Although there are other reports of the same error message, none of them helps in my case.

I have prepared my own data and split it as shown below, but I cannot obtain the confusion matrix.

test_index <- createDataPartition(y = workingData$PM10, times = 1, p = 0.5, list = FALSE)
train_set <- workingData[-test_index,]
test_set <- workingData[test_index,]

train_knn <- train(PM10 ~. , method= "knn" , data = train_set)

y_hatknn <- predict(train_knn, train_set, type = "raw")

confusionMatrix(y_hatknn, test_set$PM10)

The last line above gives

Error: `data` and `reference` should be factors with the same levels.

I would have liked to upload the data for reproduction, but I can only provide its basic structure:

str(workingData)
'data.frame':   3653 obs. of  3 variables:
 $ Date   : num  2e+07 2e+07 2e+07 2e+07 2e+07 ...
 $ Rain_mm: num  0.1 6.7 0 1.4 0.8 1.8 15.3 0 2.6 3.8 ...
 $ PM10   : num  -1 -1 -1 -1 -1 ...

PM10 is the PM10 pollution level.
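Since `str()` reports PM10 as `num`, the outcome is a numeric column rather than a factor, which may be relevant to the error. As far as I understand, caret treats a numeric outcome as a regression problem, so the predictions would be numeric as well. A minimal check in base R, using a small made-up data frame (`toy`, hypothetical) in place of `workingData`:

```r
# Hypothetical stand-in for workingData; the values are invented for illustration.
toy <- data.frame(
  Rain_mm = c(0.1, 6.7, 0, 1.4, 0.8, 1.8),
  PM10    = c(-1, -1, 12.5, 30.2, 18.0, 25.1)
)

# The outcome is numeric, not a factor
print(is.numeric(toy$PM10))
print(is.factor(toy$PM10))

# as.factor() on a numeric column creates one level per distinct value
print(nlevels(as.factor(toy$PM10)))
```

Running the same checks on `workingData$PM10` would show whether the real column behaves the same way.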

How can I resolve this?

Adding more info:

After the original error:

confusionMatrix(y_hatknn, test_set$PM10)
Error: `data` and `reference` should be factors with the same levels.

I tried converting the reference to a factor:

confusionMatrix(y_hatknn, as.factor(test_set$PM10))
Error: `data` and `reference` should be factors with the same levels.

With the prediction as a factor:

confusionMatrix(as.factor(y_hatknn), test_set$PM10)
Error: `data` and `reference` should be factors with the same levels.

With both parameters as factors:

confusionMatrix(as.factor(y_hatknn), as.factor(test_set$PM10))
Error in confusionMatrix.default(as.factor(y_hatknn), as.factor(test_set$PM10)) : the data cannot have more levels than the reference

I really need to get this sorted out.
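For what it's worth, the last error is consistent with `as.factor()` turning each distinct numeric value into its own level, so continuous predictions end up with a different set of levels than the observed values. Below is a sketch of one possible workaround in base R, assuming the goal is to compare PM10 classes rather than raw values: bin both vectors with the same `cut()` breaks so that they share levels. The numbers, break points and labels are invented for illustration and are not taken from the data.

```r
# Invented observed and predicted values (not from the original data)
observed  <- c(12.0, 35.5, 60.2, 18.3, 47.9, 75.0)
predicted <- c(15.1, 30.4, 52.7, 28.9, 49.3, 66.6)

# Converting each vector separately gives unrelated level sets
lv_obs  <- levels(as.factor(observed))
lv_pred <- levels(as.factor(predicted))
print(identical(lv_obs, lv_pred))

# Hypothetical class boundaries; choose breaks that make sense for PM10
breaks <- c(-Inf, 25, 50, Inf)
labels <- c("low", "medium", "high")

# Using the same breaks and labels guarantees identical factor levels
obs_class  <- cut(observed,  breaks = breaks, labels = labels)
pred_class <- cut(predicted, breaks = breaks, labels = labels)
print(identical(levels(obs_class), levels(pred_class)))

# Base R cross-tabulation of predicted vs. observed classes
conf_tab <- table(Predicted = pred_class, Observed = obs_class)
print(conf_tab)
```

With factors that share levels, `confusionMatrix(pred_class, obs_class)` should no longer complain about levels. The predictions would also need to be made on `test_set` rather than `train_set` so that they line up with `test_set$PM10`. If PM10 is meant to stay continuous, a regression metric such as RMSE may be a better fit than a confusion matrix.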
