Using cross-validation to find the right value of k for the k-nearest-neighbor classifier


I am working on a UCI data set about wine quality. I have applied multiple classifiers and k-nearest neighbor is one of them. I was wondering if there is a way to find the exact value of k for nearest neighbor using 5-fold cross validation. And if yes, how do I apply that? And how can I get the depth of a decision tree using 5-fold CV?

Thanks!

I assume here that you mean the value of k that gives the lowest cross-validated error on your wine quality model.

I find that a good k depends on your data. Sparse data might prefer a lower k, whereas larger datasets might work well with a larger k. In most of my work, a k between 5 and 10 has been quite good for problems with a large number of cases.

Trial and error over a range of candidate values, scored with your 5-fold cross-validation, can at times be the best tool here, and it shouldn't take long to see a trend in the modelling error.
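As a concrete sketch of that search, scikit-learn's `GridSearchCV` can run 5-fold cross-validation over a grid of k values (and, for your second question, over tree depths). This assumes scikit-learn is installed; the library's built-in wine dataset is used as a stand-in for the UCI wine-quality data, and the parameter ranges shown are just illustrative choices:

```python
# Sketch: pick k for k-NN (and max_depth for a decision tree) via 5-fold CV.
# scikit-learn's bundled wine dataset stands in for the UCI wine-quality data.
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# 5-fold cross-validation over candidate values of k
knn_search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": range(1, 21)},  # illustrative range
    cv=5,
)
knn_search.fit(X, y)
print("best k:", knn_search.best_params_["n_neighbors"])

# Same idea for decision-tree depth
tree_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": range(1, 11)},  # illustrative range
    cv=5,
)
tree_search.fit(X, y)
print("best depth:", tree_search.best_params_["max_depth"])
```

`best_params_` holds the value that achieved the highest mean cross-validated score, and `cv_results_` lets you inspect the full error-versus-k trend.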

Hope this helps!