I have a dataset that contains information about patients. It includes several variables and their clinical status (0 if they are healthy, 1 if they are sick). I have tried to implement an SVM model to predict patient status based on these variables.
library(e1071)
Index <-
order(Ytrain, decreasing = FALSE)
SVMfit_Var <-
svm(Xtrain[Index, ], Ytrain[Index],
type = "C-classification", gamma = 0.005, probability = TRUE, cost = 0.001, epsilon = 0.1)
preds1 <-
predict(SVMfit_Var, Xtest, probability = TRUE)
preds1 <-
attr(preds1, "probabilities")[,1]
samples <- !is.na(Ytest)
pred <- prediction(preds1[samples],Ytest[samples])
AUC<-performance(pred,"auc")@y.values[[1]]
prediction <- predict(SVMfit_Var, Xtest)
xtab <- table(Ytest, prediction)
To test the performance of the model, I have calculated the ROC AUC, and with the validation set I obtain an AUC = 0.997. But when I view the predictions, all the patients have been assigned as healthy.
AUC = 0.997
> xtab
prediction
Ytest 0 1
0 72 0
1 52 0
Can anyone help me with this problem?
Did you look at the probabilities versus the fitted values? You can read about how probability works with SVM here.
If you want to look at the performance you can use the library
DescToolsand the functionConfor with the librarycaretand the functionconfusionMatrix. (They provide the same output.)Your question isn't reproducible, so I went through this with
irisdata. The probability was the same for every observation. I included this, so you can see this with another data set.