Convert dataframe column names from camel case to snake case

4.1k Views Asked by At

I want to change the column labels of a Pandas DataFrame from

['evaluationId' , 'createdAt', 'scheduleEndDate', 'sharedTo', ...]

to

['EVALUATION_ID', 'CREATED_AT', 'SCHEDULE_END_DATE', 'SHARED_TO',...]

I have a lot of columns with this pattern "aaaBb" and I want to create this pattern "AAA_BB" of renamed columns

I tried something like :

new_columns = [unidecode(x).upper() for x in df.columns]

But I don't have idea how to create a solution.

1

There are 1 best solutions below

0
On

You can use a regex with str.replace to detect the lowercase-UPPERCASE shifts and insert a _, then str.upper:

df.columns = (df.columns
                .str.replace('(?<=[a-z])(?=[A-Z])', '_', regex=True)
                .str.upper()
             )

Before:

  evaluationId createdAt scheduleEndDate sharedTo
0          NaN       NaN             NaN      NaN

After:

  EVALUATION_ID CREATED_AT SCHEDULE_END_DATE SHARED_TO
0           NaN        NaN               NaN       NaN