How to remove numbers from all column names / headers in a dataframe

3.3k Views Asked by At

Hi So I have a data frame with column names that end in '2018'

I need to remove the years from these column names and am having some trouble. I also need to strip leading and trailing spaces from these column names as well.

I've already tried the following:

df.columns.str.replace('\d+',"") #to try and remove the numbers from the column names

df.columns = df.columns.str.strip('') #to try and get rid of the spaces

These do nothing to the dataframe.

I expect the column names to go from " Stock 2018" to "Stock"

but this isn't happening. Thank you for the help!

3

There are 3 best solutions below

0
Karn Kumar On BEST ANSWER

You can try using regex as well..

Example DataFrame:

>>> df = pd.DataFrame.from_dict({'Name04': ['Chris', 'Joe', 'Karn', 'Alina'], 'Age04': [14, 16, 18, 21], 'Weight04': [15, 21, 37, 45]})                                 

>>> df
   Age04 Name04  Weight04
0     14  Chris        15
1     16    Joe        21
2     18   Karn        37
3     21  Alina        45

Result using regex:

>>> df.columns = df.columns.str.replace(r'\d+', '')
>>> df
   Age   Name  Weight
0   14  Chris      15
1   16    Joe      21
2   18   Karn      37
3   21  Alina      45
2
Alexis Drakopoulos On

You are not using the proper way of renaming columns in pandas: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html

From the documentation it seems like you can simply do the following:

df = df.rename(str.replace('\d+',""), axis='columns')

let me know if that works for you.

0
Ricky Kim On

You just need to assign to df.columns for removing numbers, and also don't pass anything to str.strip() to remove leading/trailing whitespace characters.

df.columns=df.columns.str.replace('\d+','').str.strip()