Catching ZeroDivisionError Yet Skips All Iterations in a Dataframe Column


To avoid dividing by zero, the following logic is meant to calculate a percentage change and skip the rows where the prior month's value is 0. However, it sets every percentage change to null, even though the referenced columns contain valid non-NaN, non-zero numbers.

df_ex['metric_diff'] = (df_ex['metric_lastmonth'] - df_ex['metric_thismonth'])
try:
    df_ex['metric_pctdiff'] = (df_ex['metric_lastmonth'] - df_ex['metric_thismonth'])/df_ex['metric_thismonth']
except ZeroDivisionError:
    df_ex['metric_pctdiff'] = np.nan

print(len(df_ex[df_ex['metric_diff'].notna()]))

521

print(len(df_ex[df_ex['metric_pctdiff'].notna()]))

0

The outputs show that the nominal difference has non-NaN values, yet the percentage difference computed from the same two columns is NaN in every row. Did I make a logic error? What went wrong here?

Additional details: here is a sample of the DataFrame. [image]

In this example, there are two rows with non-null values from last month. [image]

The desired output should look like the following: [image]

2

There are 2 best solutions below

1
Saxtheowl On

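For numeric columns, pandas does not raise ZeroDivisionError on element-wise division. It returns inf (or -inf) when you divide a non-zero number by zero, and NaN when you divide zero by zero. That is why your except block never runs and why you get NaN values back instead of an exception.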
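To make the behavior above concrete, here is a minimal sketch. The two Series, s_num and s_den, are made-up illustration data and are not taken from the question's DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical illustration data (not from the question).
s_num = pd.Series([10.0, 0.0, 5.0, np.nan])
s_den = pd.Series([0.0, 0.0, 2.0, 4.0])

try:
    result = s_num / s_den
    raised = False
except ZeroDivisionError:
    # Not reached for float Series: the division does not raise.
    raised = True

print(raised)           # False
print(result.tolist())  # [inf, nan, 2.5, nan]
```

The last element also shows that a NaN in either operand propagates to the result, independently of any zero in the denominator.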

We can handle that case manually with the code below: it calculates the percentage difference where 'metric_lastmonth' is non-zero and places NaN where 'metric_lastmonth' is zero.

Edit: let's try a lambda function inside apply():

import pandas as pd
import numpy as np

df_ex['metric_diff'] = df_ex['metric_lastmonth'] - df_ex['metric_thismonth']

df_ex['metric_pctdiff'] = df_ex.apply(lambda row: (row['metric_diff'] / row['metric_lastmonth']) 
                                       if row['metric_lastmonth'] != 0 
                                       else np.nan, axis=1)
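
A vectorized alternative to the row-wise apply() above might look like the following sketch. It assumes the denominator is 'metric_lastmonth', as in the code above, and uses a small made-up DataFrame in place of the real df_ex. Replacing zeros in the denominator with NaN before dividing means those rows come out as NaN rather than inf.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real df_ex.
df_ex = pd.DataFrame({
    "metric_lastmonth": [100.0, 0.0, 50.0, np.nan],
    "metric_thismonth": [80.0, 10.0, 60.0, 5.0],
})

df_ex["metric_diff"] = df_ex["metric_lastmonth"] - df_ex["metric_thismonth"]

# Turn zero denominators into NaN so the division yields NaN for those rows.
denominator = df_ex["metric_lastmonth"].replace(0, np.nan)
df_ex["metric_pctdiff"] = df_ex["metric_diff"] / denominator

print(df_ex["metric_pctdiff"].tolist())  # [0.2, nan, -0.2, nan]
```

This avoids calling a Python function once per row, which usually matters on larger DataFrames.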
0
Pythonlearner On

Using .fillna(0) on both columns resolves the problem (as described in the solution to another question):

df_ex['metric_pctdiff'] = df_ex['metric_diff'].fillna(0) / df_ex['metric_lastmonth'].fillna(0)
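
One caveat with this approach: after .fillna(0), a zero denominator still produces inf (non-zero over zero) or NaN (zero over zero). The sketch below uses a made-up three-row DataFrame to show this, and adds an optional step, not part of the answer above, that converts any inf values to NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real df_ex.
df_ex = pd.DataFrame({
    "metric_lastmonth": [100.0, 0.0, np.nan],
    "metric_thismonth": [80.0, 10.0, 5.0],
})
df_ex["metric_diff"] = df_ex["metric_lastmonth"] - df_ex["metric_thismonth"]

pct = df_ex["metric_diff"].fillna(0) / df_ex["metric_lastmonth"].fillna(0)
print(pct.tolist())  # [0.2, -inf, nan]

# Optional extra step (an assumption about the desired output): treat inf as missing.
df_ex["metric_pctdiff"] = pct.replace([np.inf, -np.inf], np.nan)
print(df_ex["metric_pctdiff"].tolist())  # [0.2, nan, nan]
```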