I am trying to build a receiver operating characteristic (ROC) curves to evaluate the discriminating ability of my classifier to correctly classify diseased and non-diseased subjects.
I understand that the closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test. My experiments gave me quite desirable value of area under curve (auc), i.e. 0.86458. However, the ROC curve (in which I included the cut-off points for tracing purposes) seems quite strange as it gave me straight lines as below:
... and not a curve I expected and as I normally see from any references like this:
Does it hav something to do with the number of observations used? (in this case I only have 50 samples). Or is this just fine as long as the the auc value is high and that the 'curve' comes above the 45-degree diagonal of the ROC space? I would be glad if someone can share their thoughts about it. Thank you!
By the way, I used the perfcurve() function in matlab:
% ROC comparison between the proposed approach and the baseline
[X1,Y1,T1,auc1,OPTROCPT1,SUBY5,SUBYNAMES1] = perfcurve(testLabel,predlabel_prop,1);
[X2,Y2,T2,auc2,OPTROCPT2,SUBY2,SUBYNAMES2] = perfcurve(testLabel,predLabel_base,1);
figure;
plot(X1,Y1,'-r*',X2,Y2,'--ko');
legend('proposed approach','baseline','Location','east');
xlabel('False positive rate'); ylabel('True positive rate')
title('ROC comparison of the proposed approach and the baseline')
text(0.6,0.3,{'* - proposed method',strcat('Area Under Curve = ',...
num2str(auc1))},'EdgeColor','r');
text(0.6,0.15,{'o - baseline',strcat('Area Under Curve = ',num2str(auc2))},'EdgeColor','k');
Shape of your curve is just a result of high explanatory power of your model and limited number of observations (e.g. take a look at the example here http://nl.mathworks.com/help/stats/perfcurve.html).