Python panda multi-index dataframes: How to associate specific columns to a specific value from a new index

51 Views Asked by At

Python panda multi-index dataframes: How to associate specific columns to a specific value from a new index

I have meteorological data in a multi-index panda dataframe. For simplicity let’s limit the example to temperatures at different pressure levels (you can imagine these as corresponding to altitudes) and precipitation. Presently the df has 3 indices: Latitude, longitude and time. The temperature values are organised as one column per pressure level. Hence the df looks like:

Idx1 idx2 idx3 Precipitation(mm) T@1000hPa T@900hPa T@800hPa … .. T@10hPa comment
Lat(deg.) Long(deg.) T(hrs)
30 20 0 0.0 25.1 13.7 2.1 .. -30.5
30 21 0 0.1 25.0 13.6 1.8 .. -30.3
30 22 0 0.5 24.6 13.2 1.6 .. -30.4
30 23 0 1.2 22.1 13.0 1.9 .. -30.8
30 24 0 1.7 21.9 12.8 2.1 .. -30.5
31 20 0 0.9 23.3 13.1 2.2 .. -31.2 # next latitude
31 21 0 0.2 23.7 13.1 2.0 .. -30.9
30 20 1 0.2 24.7 13.8 2.3 .. -30.4 # next time step
30 21 1 0.8 23.6 13.7 2.0 .. -30.9

What I would like to achieve is the following:

Idx1 idx2 idx3 idx4 T(degC) comment
Lat(deg.) Long(deg.) T(hrs) pressureLevel(hPa)
30.0 20.0 0 1000 25.1
30.0 21.0 0 1000 25.0
30.0 22.0 0 1000 24.6
30.0 23.0 0 1000 22.1
30.0 24.0 0 1000 21.9
31.0 20.0 0 1000 23.3
31.0 21.0 0 1000 23.7
30 20 1 1000 24.7
30 21 1 1000 23.6
39 23 48 1000 26.2
39 24 48 1000 26.0 # last time step (48hrs)
30.0 20.0 0 900 13.7 start 900hPa pressure level
30.0 21.0 0 900 13.6
30.0 22.0 0 900 13.2
30.0 23.0 0 900 13.0
30.0 24.0 0 900 12.8
31.0 20.0 0 900 13.1 next latitude
31.0 21.0 0 900 13.1
30 20 1 900 13.8 next time step
30 21 1 900 13.7
39 23 48 900 14.0
39 24 48 900 14.2 # last time step (48hrs)
30.0 20.0 0 10 -30.5 start 10hPa pressure level
30.0 21.0 0 10 - 30.3
39 23 48 10 -32.4
39 24 48 10 -32.9 # last time step (48hrs)
Idx1 idx2 idx3 idx4 precipitation(mm) tCloudCover (+many others)
Lat(deg.) Long(deg.) T(hrs) pressureLevel(hPa)
30.0 20.0 0 9999 0.0 0.12
30.0 21.0 0 9999 0.1 0.25
39 23 48 9999 0.2 0.82
39 24 48 9999 0.8 0.84 # last time step (48hrs)

Presently I can for example plot the temperate over a geographic area for a specific pressure level, but I cannot plot a vertical temperature profile at a specific lat/long location let alone a time series of such a profile.

How can I associate all the existing ‘T@1000hPa’ column values with the index for 1000hPa, the ‘T@900hPa’ column with the 900hPa index, the ‘T@10hPa’ column with the 10hPa index and so on? There are other parameters like wind gusts that are only available at lower levels yet not at all pressure levels. And lastly how can I associate the precipitation column (and other quantities) with 9999hPa (meaning that data is not available at different pressure levels, but only as a sum over all levels or at ground)?

One of the first stumble stones is creating the index. An attempt to insert 16 pressure levels as data column with the intention to then convert that data column into an index fails with: “ValueError: Length of values (16) does not match length of index (12400)”.

I tried different (cumbersome) approaches, but could not get it to work thus far, let alone in an elegant way. Or would I have to create a second, independent dataframe to achieve this (speaking of elegance…)? Is there anyone with a good idea who can spare the 2 minutes to give me some pointers? Thanks!

0

There are 0 best solutions below