How to evaluate an expression in a pandas df that finds the greater of two variables?

I want to compute a value that is the maximum of two columns, as follows:

import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5],
                   'b': [6, 4, 2]})

df['c'] = df.eval('maximum(a,b)')

I am getting 'ValueError: "maximum" is not a supported function'. This happens even when I use engine='python', which is surprising because maximum is a NumPy ufunc. The exact computation must be supplied as an external string. How should I proceed?

eval('maximum(df.a, df.b)') 

does work fine, but I'd rather not do this for readability reasons.

mozway On BEST ANSWER

One possibility would be to use numpy.maximum as an external function:

import pandas as pd
from numpy import maximum

df = pd.DataFrame( {'a':[1,3,5],
                    'b':[6,4,2] } )

df['c'] = df.eval('@maximum(a,b)')

Output:

   a  b  c
0  1  6  6
1  3  4  4
2  5  2  5
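Since the question says the expression has to arrive as an external string, here is a minimal self-contained sketch of that case. It assumes the string already uses the `@` prefix, which tells `df.eval` to look the name up in the calling scope rather than among its built-in functions; the variable names `expr` and `expected` are only illustrative.

```python
import pandas as pd
from numpy import maximum  # must be in scope so that '@maximum' can be resolved

df = pd.DataFrame({'a': [1, 3, 5],
                   'b': [6, 4, 2]})

# The expression is a plain string, as if it had been read from an external source.
expr = '@maximum(a, b)'
df['c'] = df.eval(expr)

# Cross-check against the row-wise maximum computed without eval.
expected = df[['a', 'b']].max(axis=1)
print(df)
```

The bare name `maximum` fails because `df.eval` only recognises a fixed set of math functions by name; anything else has to be referenced as a local variable with `@`.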
j stevenage On
df["c"] = df[["a", "b"]].max(axis=1)

The intuition is that with axis=1, max is taken across the columns within each row (horizontally), so you get one value per row. With axis=0, the default, it is taken down each column instead, so you get one value per column.
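To make the difference between the two axes concrete, here is a small sketch using the DataFrame from the question. The names `row_max` and `col_max` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5],
                   'b': [6, 4, 2]})

# axis=1: compare a and b within each row, giving one value per row.
row_max = df[['a', 'b']].max(axis=1)

# axis=0: take the maximum down each column, giving one value per column.
col_max = df[['a', 'b']].max(axis=0)

print(row_max.tolist())   # [6, 4, 5]
print(col_max.to_dict())  # maximum of column a is 5, of column b is 6
```

Note that this approach does not take an expression string, so it only fits if the requirement to pass the computation as a string can be relaxed.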