Does sklearn.preprocessing StandardScaler convert the data into a standard normal distribution?


StandardScaler() from sklearn.preprocessing claims to make mean=0 and std=1. In practice, the mean is a very small number close to 0 and, similarly, the std is close to 1 but not exactly equal to it. Does it really convert the data into a standard normal distribution, given that it uses:

z = (x - mu) / sigma

There is 1 solution below


You can try it out on your own dataset, using columns whose values span different ranges. Scaling with StandardScaler standardizes the data, which helps distance-based algorithms because no single column dominates the distance calculation simply by having larger values.

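The z-score rescales each column to have mean = 0 and standard deviation = 1, up to floating-point rounding, which is why you see values very close to, but not exactly, 0 and 1. It only shifts and rescales the values and does not change the shape of the distribution, so the result is standard normal only if the column was normally distributed to begin with.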
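
To see what the formula does and does not do, here is a minimal sketch that applies it by hand with NumPy. The skewed sample data and the skewness() helper are made up for illustration and are not part of scikit-learn. The population standard deviation (ddof=0) is used, which is also what StandardScaler uses.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical, strongly right-skewed feature (exponential draws)
x = rng.exponential(scale=2.0, size=10_000)

def skewness(v):
    """Sample skewness: third central moment divided by the std cubed."""
    centered = v - v.mean()
    return (centered ** 3).mean() / v.std() ** 3

# z = (x - mu) / sigma, with the population standard deviation (ddof=0)
z = (x - x.mean()) / x.std()

print(z.mean())                  # tiny, but usually not exactly 0 (rounding)
print(z.std())                   # 1 up to rounding
print(skewness(x), skewness(z))  # equal up to rounding: the shape is unchanged
```

The mean and standard deviation land on 0 and 1 up to rounding, while the skewness before and after is the same, so a skewed column stays skewed after standardization.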

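from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Learn the per-column mean and standard deviation from the training data only
scaler.fit(X_train)
# Apply those training statistics to both sets (a scaler has transform(), not predict())
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)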

Note: Here, I am not applying StandardScaler to y_train. The scaler is meant for the feature columns, and the target y is usually left unscaled.
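
As a rough end-to-end check, the sketch below fits the scaler on a training split and applies it to a held-out split. The synthetic data and the 400/100 split are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical data: 3 features on very different scales
X = rng.normal(loc=[0.0, 50.0, 1000.0], scale=[1.0, 10.0, 300.0], size=(500, 3))
X_train, X_test = X[:400], X[400:]

# Fit on the training split only, then reuse the same statistics for the test split
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Training columns: mean ~0 and std ~1, up to floating-point rounding
print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
# Test columns: only approximately 0 and 1, since the training statistics are used
print(X_test_scaled.mean(axis=0), X_test_scaled.std(axis=0))
```

The test split will generally not have a mean of exactly 0 or a std of exactly 1, which is expected: it is scaled with the mean and standard deviation learned from the training split.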