Some of my X_test values fall outside the range I specified in the normalization function. Why does this happen, and how can I fix it? (The slice [:,14:] is applied to X_train and X_test because the numerical columns in my dataset start at that column.)
from sklearn.preprocessing import MinMaxScaler
scalar = MinMaxScaler(feature_range=(-1,1))
X_train[:,14:]=scalar.fit_transform(X_train[:,14:])
X_test[:,14:]=scalar.transform(X_test[:,14:])
Plotting X_train and X_test shows that the values in X_train are within the range, while some values in X_test fall outside it.
This is the X_train plot
This is the X_test plot
Why is this happening?


You are calling fit on the training set (through fit_transform), as should be done. This means that in the formula (X - X_min) / (X_max - X_min), X_min and X_max refer to the minimum and maximum values in your training set respectively, NOT the test set. The result is then rescaled to the feature_range you provided.
So if your test set has values below the training minimum or above the training maximum, those values will be mapped outside the feature_range that you provided, by simple arithmetic.
This shouldn't be anything to worry about in your case: the scaled test set values are quite close to the feature_range you provided. Just make sure the values in your test set aren't on a totally different scale from those in your training set. You might consider removing the outliers in your test set to solve the issue.
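As a rough illustration of the arithmetic above, here is a minimal sketch on made-up toy data (the arrays and the names scaler and clipping_scaler are hypothetical, not taken from the question). It also shows the clip=True option of MinMaxScaler, which assumes scikit-learn 0.24 or later and may be an alternative to dropping test rows if you only need the values capped.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data, made up for illustration: a single numeric feature.
X_train = np.array([[0.0], [5.0], [10.0]])  # training min = 0, max = 10
X_test = np.array([[-2.0], [5.0], [12.0]])  # -2 and 12 lie outside [0, 10]

scaler = MinMaxScaler(feature_range=(-1, 1))
X_train_scaled = scaler.fit_transform(X_train)  # learns min/max from X_train only
X_test_scaled = scaler.transform(X_test)        # reuses the training min/max

print(X_train_scaled.ravel())  # all values stay within [-1, 1]
print(X_test_scaled.ravel())   # -2 and 12 map below -1 and above 1

# Assumption: scikit-learn >= 0.24, where clip=True caps transformed
# values at the bounds of feature_range.
clipping_scaler = MinMaxScaler(feature_range=(-1, 1), clip=True).fit(X_train)
X_test_clipped = clipping_scaler.transform(X_test)
print(X_test_clipped.ravel())  # out-of-range values are capped at -1 and 1
```

Clipping hides how far a test value was from the training range, so whether it is appropriate depends on what the downstream model expects.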