Imputing with NN left NaN in the data


I tried imputing the missing values in a dataset with the nearest-neighbour method. It filled almost all of the NaNs, but it missed two.

I'm working on the Titanic dataset and I'm trying to impute the ages in my test set with the NN method like this:

titanic_Test['age'] = titanic_Test['age'].interpolate(method='nearest')

It works well, except that the age in these two rows is left as NaN:

     pclass  sex  age  sibsp     parch      fare  cabin embarked
416     2.0  0.0  NaN  0.000  0.000000  0.015713      0        S
417     2.0  0.0  NaN  0.125  0.111111  0.043640      0        C

Why?


There is 1 solution below

Rutvi Rajesh On

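Note that interpolate(method='nearest') is not a k-nearest-neighbours imputer: it does not look at the other columns to find similar passengers. It fills each missing age with the closest known age by position along the index, and it only fills gaps that have a known value on both sides. Rows 416 and 417 look like the last rows of your test set, so if there is no known age after them, there is nothing to interpolate between and they stay NaN. In that situation you need a different method for the edges, such as a forward fill after the interpolation, or an imputer that actually uses the other columns. It is also worth thinking about why those ages are missing in the first place.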
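One thing worth checking is where the leftover NaNs sit in the column. As far as I can tell, interpolate(method='nearest') works along the index rather than on the other columns, and it does not fill missing values that come after the last known one. Here is a minimal sketch with made-up ages (not the real Titanic data; it assumes SciPy is installed, which pandas needs for this method) that puts two NaNs at the end of the column and then fills them with a forward/backward fill:

```python
import numpy as np
import pandas as pd

# Made-up ages for illustration only; the last two entries are missing,
# mirroring rows 416 and 417 in the question.
age = pd.Series([22.0, np.nan, 35.0, 54.0, np.nan, np.nan], name="age")

# Nearest-value interpolation along the index (pandas delegates this to SciPy).
nearest = age.interpolate(method="nearest")

# Fill anything still missing at the edges with the closest known value:
# ffill covers trailing NaNs, bfill covers leading ones.
filled = nearest.ffill().bfill()

print("NaNs after interpolate:", int(nearest.isna().sum()))
print("NaNs after ffill/bfill:", int(filled.isna().sum()))
```

If you want the imputation to use the other columns (pclass, sex, fare and so on) to find similar passengers, scikit-learn's KNNImputer is closer to what "nearest neighbour" usually means, though whether it suits your data is a judgment call.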