Calling MinMaxScaler differs between same sets

25 Views Asked by B Grau At 26 March 2024 at 21:09

I have scaled my dataset using the MinMaxScaler from sklearn like this:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# create a MinMaxScaler object
self.scaler = MinMaxScaler(feature_range=(0, 1))

# fit the scaler to the dataset
self.scaler.fit(self.X_org)

# transform dataset using the scaler
self.X_scalled = pd.DataFrame(self.scaler.transform(self.X_org), columns=self.X_org.columns)

return self.X_scalled

However, I am now using the last 10% of the entire dataset for a validation run, and I scale that data with the scaler fitted on the training dataset, like so:

X_input_val_data_scalled = pd.DataFrame(self.scaler.transform(X_input_val_data), columns=X_input_val_data.columns)

Now my challenge:

In the training X_org set I get a nicely scaled dataset ranging from 0 to 1. In the scaled validation X dataset I get completely weird data ranging from 7.5 to 8...

What am I doing wrong?

There is 1 best solution below

Taha Akbari On 26 March 2024 at 23:04

That is actually how it is supposed to work. A min-max scaler scales the data as follows:

(data - min(x)) / (max(x) - min(x))

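where x is the data the min-max scaler was fitted on (here, the training set), with min and max taken per feature. If data belongs to x, then data - min(x) is a non-negative number no larger than max(x) - min(x), so the ratio lies between 0 and 1. For data outside x, which is the case for your validation data, the ratio does not have to lie between 0 and 1. Scaled values of 7.5 to 8 mean the validation values sit roughly 7.5 to 8 training ranges above the training minimum.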
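To make this concrete, here is a minimal sketch with made-up numbers (the column name `feature` and all values are assumptions chosen for illustration, not the asker's data). The scaler is fitted on a training column spanning 0 to 10 and then applied to validation values around 75 to 80, which reproduces the kind of output described in the question:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: training values span 0..10, validation values lie far above that.
X_train = pd.DataFrame({"feature": [0.0, 2.0, 5.0, 10.0]})
X_val = pd.DataFrame({"feature": [75.0, 78.0, 80.0]})

# Fit on the training data only; min and max are learned from X_train.
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(X_train)

# Training data maps into [0, 1] because it defined min and max.
train_scaled = scaler.transform(X_train)

# Validation data is scaled with the *training* min and max,
# so it lands wherever it falls relative to the training range.
val_scaled = scaler.transform(X_val)

# The same result computed by hand from the formula in the answer.
manual = (X_val.to_numpy() - scaler.data_min_) / (scaler.data_max_ - scaler.data_min_)

print(train_scaled.ravel())  # values between 0 and 1
print(val_scaled.ravel())    # values around 7.5 to 8
print(np.allclose(manual, val_scaled))
```

So the scaler itself is behaving as designed. Output like this usually points to the validation slice having a different value range than the training data, for example a trending series where the last 10% is much larger. If out-of-range values must be avoided, `MinMaxScaler` also accepts `clip=True` (scikit-learn 0.24 and later), which clips transformed values to `feature_range`, at the cost of flattening everything outside the training range.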
