Python (Pandas) error 'the label [Algeria] is not in the [index]'


I do not understand why this works:

df[(df['Gold']>0) & (df['Gold.1']>0)].loc[((df['Gold'] - df['Gold.1'])/(df['Gold'])).abs().idxmax()]

but when I divide by (df['Gold'] + df['Gold.1'] + df['Gold.2']) instead, it stops working and raises the error shown below.

Interestingly, the following line works:

df.loc[((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()]

I have only just started learning Python and pandas, so I do not understand what is happening. Why does this happen, and how can I fix it?

ERROR

KeyError: 'the label [Algeria] is not in the [index]'

DataFrame snapshot: [image not included]

Best answer

Your problem is boolean indexing:

df[(df['Gold']>0) & (df['Gold.1']>0)]

returns a filtered DataFrame. That frame does not contain the index label of the maximum of the Series, because the Series is calculated from the full, unfiltered df:

((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()

In your data that label is Algeria, a row that the filter removes.

So .loc looks the label up in the filtered DataFrame, does not find it, and raises a KeyError.
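
The mismatch can be reproduced on a small frame. This is only a sketch: the numbers and the country labels other than Algeria are made up for illustration, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame(
    {"Gold": [5, 10, 4], "Gold.1": [0, 2, 3], "Gold.2": [5, 12, 7]},
    index=["Algeria", "Norway", "Chile"],
)

# The ratio is computed on the FULL frame, so idxmax may return any label of df.
ratio = ((df["Gold"] - df["Gold.1"]) / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
label = ratio.idxmax()

# The filter drops every row where Gold or Gold.1 is not positive.
filtered = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]

# Looking up a label that the filter removed raises KeyError.
try:
    filtered.loc[label]
    raised = False
except KeyError:
    raised = True

print(label, list(filtered.index), raised)
```

Here the largest ratio belongs to a row with Gold.1 equal to 0, so the label returned by idxmax is no longer in the filtered index.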

One possible solution is to assign the filtered DataFrame to df1 and then compute the Series from df1, so that idxmax returns a label that exists in df1:

df1 = df[(df['Gold']>0) & (df['Gold.1']>0)]
df2 = df1.loc[((df1['Gold']-df1['Gold.1'])/(df1['Gold']+df1['Gold.1']+df1['Gold.2'])).abs().idxmax()]
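
As a quick check, the same approach can be run on made-up data. This is a sketch under assumptions: the values and the labels Norway and Chile are hypothetical, and the names ratio1, mask and alt_label are introduced here for illustration. The second variant, which computes the ratio on the full frame and applies the same boolean mask to the Series before calling idxmax, is an alternative and not part of the answer above.

```python
import pandas as pd

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame(
    {"Gold": [5, 10, 4], "Gold.1": [0, 2, 3], "Gold.2": [5, 12, 7]},
    index=["Algeria", "Norway", "Chile"],
)

# Filter first, then compute the ratio from the filtered frame only.
df1 = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]
ratio1 = ((df1["Gold"] - df1["Gold.1"]) / (df1["Gold"] + df1["Gold.1"] + df1["Gold.2"])).abs()
best_label = ratio1.idxmax()
df2 = df1.loc[best_label]

# Alternative: compute the ratio on the full frame, then mask the Series
# with the same condition so idxmax only sees rows that pass the filter.
mask = (df["Gold"] > 0) & (df["Gold.1"] > 0)
ratio = ((df["Gold"] - df["Gold.1"]) / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
alt_label = ratio[mask].idxmax()

print(best_label, alt_label)
```

Both variants pick the label from rows that survive the filter, so the lookup with .loc cannot miss.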