Variable error rate of an SVM classifier using k-fold cross-validation in MATLAB


I'm using k-fold cross-validation to estimate the error rate of an SVM classifier. This is the code with which I'm getting the error rate for 8-fold cross-validation:

data = load('Entrenamiento.txt');
group = importdata('Grupos.txt');
CP = classperf(group);

N = length(group);
k = 8; 
indices = crossvalind('KFold',N,k); 
single_error = zeros(1,k);
for j = 1:k
    test = (indices == j);
    train = ~test;
    SVMModel_1 = fitcsvm(data(train,:), group(train,:), 'BoxConstraint', 1, 'KernelFunction', 'linear');
    classification = predict(SVMModel_1, data(test,:));
    classperf(CP, classification, test);
    single_error(1,j) = CP.ErrorRate;  % note: CP.ErrorRate is cumulative over all folds evaluated so far
end
confusion_matrix = CP.CountingMatrix 
VP = confusion_matrix(1,1);
FP = confusion_matrix(1,2);
FN = confusion_matrix(2,1);
VN = confusion_matrix(2,2);
mean_error = mean(single_error)

However, the mean_error changes each time I run the script. This is due to crossvalind, which generates random cross-validation indices, so each run of the script produces a different partition of the data.

What should I do to calculate the true error rate? Should I calculate the mean error rate of n code executions? Or what value should I use?


Best Answer

You can check the Wikipedia article on cross-validation:

In k-fold cross-validation, the original sample is randomly partitioned into k equal size subsamples.

and

The k results from the folds can then be averaged (or otherwise combined) to produce a single estimation.

So there is no need to worry about getting different error rates from randomly selected folds; of course the results will differ from run to run.

However, if your error rates vary over a wide range, increasing k can help reduce the variance of the estimate.
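Another common way to stabilize the estimate is to repeat the whole k-fold procedure several times with fresh random partitions and average the resulting mean errors. A minimal sketch, reusing the variables from the question (the fold error is computed directly here rather than via classperf, and the strcmp comparison assumes group is a cell array of class labels):

```matlab
% Repeated k-fold cross-validation: average over n_reps random partitions
% to reduce the run-to-run variance of the error estimate.
n_reps = 10;
rep_error = zeros(1, n_reps);
for r = 1:n_reps
    indices = crossvalind('KFold', N, k);   % fresh random partition each repetition
    fold_error = zeros(1, k);
    for j = 1:k
        test = (indices == j);
        train = ~test;
        mdl = fitcsvm(data(train,:), group(train,:), ...
            'BoxConstraint', 1, 'KernelFunction', 'linear');
        pred = predict(mdl, data(test,:));
        % Per-fold misclassification rate (assumes cell-array labels)
        fold_error(j) = mean(~strcmp(pred, group(test,:)));
    end
    rep_error(r) = mean(fold_error);
end
repeated_cv_error = mean(rep_error)
```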

Also, rng can be used to fix the random seed so that the results are reproducible.
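A minimal sketch of both points: seeding the generator makes crossvalind (Bioinformatics Toolbox) deterministic, and fitcsvm's built-in 'KFold' option with kfoldLoss returns the averaged cross-validation error directly, without a manual loop (assumes the Statistics and Machine Learning Toolbox):

```matlab
rng(1);                                  % fix the random seed: same folds on every run
indices = crossvalind('KFold', N, k);    % indices are now reproducible

% Alternatively, let fitcsvm handle the partitioning itself:
cv_model = fitcsvm(data, group, 'BoxConstraint', 1, ...
    'KernelFunction', 'linear', 'KFold', 8);
cv_error = kfoldLoss(cv_model);          % mean misclassification rate over the 8 folds
```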