Missing value imputation using the fancyimpute KNN package in Python

I am trying to use the fancyimpute KNN package to impute the missing values in my dataframe. The columns have very different ranges, i.e., some of them hold much larger values than others.

My understanding is that the KNN algorithm uses the Euclidean distance to determine the nearest neighbors. My question is whether I should normalize the data before feeding it to the algorithm, or whether it does so by default.
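To make the concern concrete, here is a small standard-library sketch. The rows, the column meanings (age, income), and the helper names are all made up for illustration; it only shows how the nearest neighbor under Euclidean distance can change once each column is standardized, and it does not reproduce what fancyimpute does internally.

```python
import math

# Hypothetical rows: (age in years, income in dollars). Row 0 is the query.
rows = [
    (25.0, 50_000.0),
    (27.0, 52_000.0),   # close in age, somewhat different income
    (60.0, 50_100.0),   # very different age, almost identical income
    (40.0, 90_000.0),
    (35.0, 20_000.0),
]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize(data):
    """Rescale every column to zero mean and unit (population) variance."""
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, stds))
        for row in data
    ]

def nearest_to_first(data):
    """Index of the row closest to row 0 (excluding row 0 itself)."""
    query = data[0]
    return min(range(1, len(data)), key=lambda i: euclidean(query, data[i]))

raw_nearest = nearest_to_first(rows)                  # income dominates
scaled_nearest = nearest_to_first(standardize(rows))  # both columns count

print("nearest on raw data:   row", raw_nearest)
print("nearest on scaled data: row", scaled_nearest)
```

On the raw values the income column dominates the distance, so the 60-year-old with a nearly identical income is picked; after standardization the 27-year-old becomes the nearest row.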

There is 1 solution below

The source code of the fancyimpute.knn.KNN class shows that it accepts a normalizer argument, which can be set to any object with fit() and transform() methods.

By default it is set to None, so no normalization is applied unless you explicitly create a normalizer and pass it to the KNN object.
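A minimal sketch of such a normalizer, assuming NumPy is available. The class name NanStandardScaler and the sample data are hypothetical. The fit_transform() and inverse_transform() methods go beyond the fit()/transform() interface described above; they are included on the assumption that the solver may also call them, so check the fancyimpute source for your installed version.

```python
import numpy as np

class NanStandardScaler:
    """Hypothetical column-wise standardizer that ignores NaN entries."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Column statistics computed over the observed (non-NaN) values only.
        self.mean_ = np.nanmean(X, axis=0)
        std = np.nanstd(X, axis=0)
        # Guard against division by zero for constant columns.
        self.scale_ = np.where(std > 0, std, 1.0)
        return self

    def transform(self, X):
        # NaNs stay NaN, so the missing-value mask is preserved.
        return (np.asarray(X, dtype=float) - self.mean_) / self.scale_

    def fit_transform(self, X):
        return self.fit(X).transform(X)

    def inverse_transform(self, X):
        # Map standardized values back to the original units.
        return np.asarray(X, dtype=float) * self.scale_ + self.mean_

# Made-up data with columns on very different scales and some gaps.
X = np.array([
    [25.0, 50_000.0],
    [27.0, np.nan],
    [60.0, 50_100.0],
    [np.nan, 90_000.0],
    [35.0, 20_000.0],
])

scaler = NanStandardScaler()
X_scaled = scaler.fit_transform(X)

# Assumed usage with fancyimpute (not executed here; the keyword name
# `normalizer` is taken from the answer above):
# from fancyimpute import KNN
# X_filled = KNN(k=3, normalizer=scaler).fit_transform(X)
```

Computing the statistics with the NaN-aware NumPy functions keeps the missing entries from poisoning the column means, which matters because the data is incomplete at the point where the normalizer sees it.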