Compare two dataframes in pandas


I am a beginner. I have two dataframes in pandas, and I would like to identify what changed between the original and the new dataframe.

  • Rows: products
  • Columns: demand for future periods

Differences between the dataframes could be: new rows, deleted rows, and changed demand.

Ideally I would make a heatmap showing the changes, but I'm stuck: I am unsure whether I have to iterate over the rows or not.

A record in a dataframe is:

ProductId | demand_Month1 | demand_Month2 | demand_Month3 ... MonthX

This data is updated monthly. I would like to generate the following table:

ProductId | old - new (demand) ... for each month

Both dataframes contain demand data for the same months.


There is 1 solution below.

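from pandas import DataFrame

def dataframe_difference(df1: DataFrame, df2: DataFrame, which=None):
    """Find rows which are different between two DataFrames."""
    # With no `on` argument, merge joins on all columns the frames share,
    # so a row with any changed value shows up once per side.
    comparison_df = df1.merge(
        df2,
        indicator=True,
        how='outer'
    )
    if which is None:
        # Keep every row that is not identical in both frames.
        diff_df = comparison_df[comparison_df['_merge'] != 'both']
    else:
        # `which` is one of 'left_only', 'right_only' or 'both'.
        diff_df = comparison_df[comparison_df['_merge'] == which]
    # Note: the 'data' directory must already exist.
    diff_df.to_csv('data/diff.csv')
    return diff_df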

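Look at this for a start. Rows tagged left_only exist only in df1 (deleted rows or old values), and rows tagged right_only exist only in df2 (new rows or updated values).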
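
To get closer to the "ProductId | old - new" table from the question, one option is to index both frames by product and subtract them, with no explicit iteration. The following is only a sketch under a few assumptions: the sample data and the helper name `demand_changes` are made up for illustration, `ProductId` is unique within each frame, both frames have the same month columns, and a product missing from one frame is treated as having zero demand there.

```python
import pandas as pd

# Hypothetical sample data; column names follow the layout in the question.
old = pd.DataFrame({
    "ProductId": ["A", "B", "C"],
    "demand_Month1": [10, 20, 30],
    "demand_Month2": [11, 21, 31],
})
new = pd.DataFrame({
    "ProductId": ["A", "C", "D"],
    "demand_Month1": [10, 35, 40],
    "demand_Month2": [11, 31, 41],
})


def demand_changes(old, new, key="ProductId"):
    """Return (delta, status): old - new per month and a label per product."""
    old_i = old.set_index(key)
    new_i = new.set_index(key)
    # Union of product ids, so new and deleted products are both kept.
    all_ids = old_i.index.union(new_i.index)
    # Assumption: a product absent from one frame counts as zero demand there.
    delta = (old_i.reindex(all_ids, fill_value=0)
             - new_i.reindex(all_ids, fill_value=0))
    # Label each product; later assignments override earlier ones.
    status = pd.Series("unchanged", index=all_ids, name="status")
    status[delta.ne(0).any(axis=1)] = "changed"
    status[~all_ids.isin(old_i.index)] = "new"
    status[~all_ids.isin(new_i.index)] = "deleted"
    return delta, status


delta, status = demand_changes(old, new)
print(delta.join(status))
```

Here `delta` holds old minus new for every product and month, and `status` marks each product as unchanged, changed, new or deleted. For the heatmap, in a notebook with matplotlib installed, something like `delta.style.background_gradient(cmap="RdBu", axis=None)` should colour the cells by the size of the change.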