I have seperate csv data for each weather station, each climate scenario, each RCP and each variable. After appending the data an example of it looks like this:
Example:
import pandas as pd
df_before = pd.DataFrame({'canton': ['AG', 'AG', 'AG', 'AG', 'AG', 'AG', 'AG', 'AG', 'AG', 'BE', 'BE', 'BE'],
'time': ['1981-01-01', '1981-01-01', '1981-01-01', '1981-01-01', '1981-01-01',
'1981-01-01', '1981-01-01', '1981-01-01', '1981-01-02', '1981-01-02', '1981-01-02', '1981-01-02'],
'value': [3, 5, 1, 2, 4, 5, 6, 7, 2, 1, 2, 3],
'stn': ['BUS', 'BUS', 'BUS', 'BUS', 'BUS', 'BUS', 'BUS', 'BUS', 'BUS', 'BER', 'BER', 'BER'],
'RCPs': ['RCP26', 'RCP26', 'RCP26', 'RCP45', 'RCP45', 'RCP45', 'RCP85', 'RCP85', 'RCP85', 'RCP26', 'RCP26', 'RCP26'],
'var': ['tas', 'tasmin','tasmax', 'tas', 'tasmin', 'tasmax', 'tas', 'tasmin', 'tasmax', 'tas', 'tasmin', 'tasmax', ]
})
However, I would want to split the value column on to the var (tas, tasmin, tasmax, pr) column, without loosing information from the RCPs and the station:
df_after = pd.DataFrame({'canton': ['AG', 'AG', 'AG', 'BE'],
'time': ['1981-01-01', '1981-01-01', '1981-01-01', '1981-01-02'],
'stn': ['BUS', 'BUS', 'BUS', 'BER'],
'RCPs': ['RCP26', 'RCP45', 'RCP85', 'RCP26'],
'tas': ['3', '2', '6', '1'],
'tasmin': ['5', '4', '7', '2'],
'tasmax': ['1', '5', '2', '3']
})
Such that it looks like this: df_after
Like I mentioned in the comments, you can use the
pivot()
method:The
rename_axis()
method removes the name of the column index after pivoting andreset_index()
converts the multi-index created by pivot to dataframe columns.