Standardizing a set of columns in a pandas dataframe with sklearn

2.5k Views Asked by new_datascientist At 01 June 2021 at 23:17

I have a table with four columns: CustomerID, Recency, Frequency and Revenue.

I need to standardize (scale) the columns Recency, Frequency and Revenue and save the column CustomerID.

I used this code:

import pandas as pd
from sklearn.preprocessing import StandardScaler
df.set_index('CustomerID', inplace = True)
standard_scaler = StandardScaler()
df = standard_scaler.fit_transform(df)
df = pd.DataFrame(data = df, columns = ['Recency', 'Frequency','Revenue'])

But the result is a table without the column CustomerID. Is there any way to get a table with the corresponding CustomerID and the scaled columns?

There are 3 best solutions below

Arturo Sbr On 01 June 2021 at 23:31 BEST ANSWER

fit_transform returns a NumPy ndarray, which carries no index or column labels, so you lose the CustomerID index you set with df.set_index('CustomerID', inplace = True).
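If you would rather keep CustomerID as the index, one option is to rebuild the DataFrame from the returned array and pass the original index and column names back in. The following is a minimal sketch of that idea; the sample values are made up for illustration and are not from the question.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104],
    "Recency": [10, 20, 30, 40],
    "Frequency": [1, 2, 3, 4],
    "Revenue": [100.0, 250.0, 50.0, 400.0],
}).set_index("CustomerID")

# fit_transform returns a plain ndarray: the index and column labels are gone
scaled = StandardScaler().fit_transform(df)

# Reattach the original index and column names
df_scaled = pd.DataFrame(scaled, index=df.index, columns=df.columns)

# Turn CustomerID back into a regular column if that is what you need
df_scaled = df_scaled.reset_index()
```

Because the row order of the array matches the row order of df, reusing df.index lines every scaled row up with its CustomerID.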

Instead of doing this, you can simply take the subset of columns you need to transform, pass them to StandardScaler, and overwrite the original columns.

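from sklearn.preprocessing import StandardScaler

# Subset of columns to transform
cols = ['Recency','Frequency','Revenue']

# Overwrite old columns with transformed columns
# (fit_transform must be called on a StandardScaler instance, not on the class)
scaler = StandardScaler()
df[cols] = scaler.fit_transform(df[cols])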

This way, you leave CustomerID completely unchanged.
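To check the result, here is a small self-contained sketch of the column-subset approach on made-up data (the values are illustrative assumptions). Note that StandardScaler divides by the population standard deviation, so the check uses ddof=0, whereas pandas' std() defaults to ddof=1.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104],
    "Recency": [10, 20, 30, 40],
    "Frequency": [1, 2, 3, 4],
    "Revenue": [100.0, 250.0, 50.0, 400.0],
})
cols = ["Recency", "Frequency", "Revenue"]
ids_before = df["CustomerID"].copy()

# Scale only the selected columns; CustomerID is never passed to the scaler
df[cols] = StandardScaler().fit_transform(df[cols])

means = df[cols].mean()        # each value should be close to 0
stds = df[cols].std(ddof=0)    # each value should be close to 1
ids_unchanged = df["CustomerID"].equals(ids_before)
```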

Benjamin Ziepert On 01 June 2022 at 11:11

You can use scale to standardize specific columns:

from sklearn.preprocessing import scale
cols = ['Recency', 'Frequency', 'Revenue']
df[cols] = scale(df[cols])
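One thing to keep in mind: scale is a one-shot function and does not remember the mean and standard deviation it used, while a fitted StandardScaler does. If you later need to apply the same scaling to new rows, the fitted scaler is the more convenient choice. A rough sketch, with hypothetical DataFrames train and new:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, scale

cols = ["Recency", "Frequency", "Revenue"]

# Hypothetical data, for illustration only
train = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104],
    "Recency": [10, 20, 30, 40],
    "Frequency": [1, 2, 3, 4],
    "Revenue": [100.0, 250.0, 50.0, 400.0],
})
new = pd.DataFrame({
    "CustomerID": [105],
    "Recency": [25],
    "Frequency": [4],
    "Revenue": [200.0],
})

# On the same data, scale and StandardScaler give the same numbers
same = np.allclose(scale(train[cols]), StandardScaler().fit_transform(train[cols]))

# A fitted scaler can reuse the training mean and std on new rows
scaler = StandardScaler().fit(train[cols])
new[cols] = scaler.transform(new[cols])
```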

Salio On 13 February 2023 at 08:15

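You can use this method, assuming CustomerID is the first column:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# A DataFrame cannot be sliced as df[:, 1:]; use iloc for positional selection
scaled_cols = df.columns[1:]  # every column except the first (CustomerID)
df[scaled_cols] = sc.fit_transform(df.iloc[:, 1:])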
