K value vs Accuracy in KNN


I am trying to learn KNN by working on the Breast Cancer dataset from the UCI repository. The dataset has 699 samples with 9 continuous features and 1 class variable.

I tested accuracy on a validation set. For K = 21 and K = 19, accuracy is 95.7%.

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Fit KNN with K = 21 on the training split
neigh = KNeighborsClassifier(n_neighbors=21)
neigh.fit(X_train, y_train)

# Evaluate on the validation split
y_pred_val = neigh.predict(X_val)
print(accuracy_score(y_val, y_pred_val))

But for K = 1 I am getting accuracy = 97.85%, and for K = 3, accuracy = 97.14%.
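For reference, here is a minimal sketch of how one might compare several K values with k-fold cross-validation rather than a single validation split (this uses sklearn's cross_val_score; X and y are placeholders for the full feature matrix and class labels, not names from the code above):

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Try odd K values only, to avoid ties in a two-class problem
for k in range(1, 30, 2):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=10)
    print(k, scores.mean())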

I read (here):

Choice of k is very critical – a small value of k means that noise will have a higher influence on the result. A large value makes it computationally expensive and somewhat defeats the basic philosophy behind KNN (that points that are near are likely to have similar densities or classes). A simple approach to selecting k is to set k = n^(1/2).
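As a quick check, applying that rule of thumb to this dataset gives k = sqrt(699) ≈ 26, which one would typically round to a nearby odd value such as 25 or 27 to avoid ties (a sketch; n = 699 is the dataset size stated above):

import math

n = 699                  # total number of samples in the dataset
k = round(math.sqrt(n))  # rule-of-thumb k = n^(1/2), gives 26
print(k)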

Which value of K should I choose for my model? Can you elaborate on the logic behind it?

Thanks in advance!
