How to iterate over every row of a dataframe and make calculations based on another table?


I have a dataframe like this, basically:

  • table1: every market has 2 groups, and the data covers 2 years. I want to build a new table based on several conditions.
date market group cost income clicks
.... A      R     .... ....   ....
.... A      N     .... ....   ....
.... B      R     .... ....   ....
.... B      N     .... ....   ....
.... C      R     .... ....   ....
.... C      N     .... ....   ....

I created this table of summed cost by market and group:

table2 = table1.groupby(["market", "group"])[["cost"]].sum().reset_index()
market group cost
A      R     ....
A      N     ....
B      R     ....
B      N     ....
C      R     ....
C      N     ....

I want to create a new table by iterating over every row of table2 and filling a new column as follows: if cost < 1000, go back to table1 for that row's market and group, sum income and divide it by the summed clicks; if cost > 1000, the value should be 0.

market group cost new_column
A      R     .... ....
A      N     .... ....
B      R     .... ....
B      N     .... ....
C      R     .... ....
C      N     .... ....

I tried this:

for index, row in table2.iterrows():
    # Select the matching rows of table1 for this market/group pair
    mask = (table1['market'] == row['market']) & (table1['group'] == row['group'])
    if row['cost'] < 1000:
        # Assign to this row only (not the whole column), and write to table2, not table1
        table2.loc[index, 'new_column'] = table1.loc[mask, 'income'].sum() / table1.loc[mask, 'clicks'].sum()
    else:
        table2.loc[index, 'new_column'] = 0

I just want to iterate over every row of the second table and look up the matching values in the first table. How can I do this?
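With some small illustrative data (column names taken from the question, values invented for the example), the row-wise approach runs end to end like this:

```python
import pandas as pd

# Illustrative table1 -- the real data has a date column and more rows
table1 = pd.DataFrame({
    "market": ["A", "A", "B", "B"],
    "group":  ["R", "N", "R", "N"],
    "cost":   [400, 1200, 600, 700],
    "income": [80, 30, 90, 40],
    "clicks": [20, 10, 30, 8],
})

# Summed cost per market/group, as in the question
table2 = table1.groupby(["market", "group"])[["cost"]].sum().reset_index()

for index, row in table2.iterrows():
    # Rows of table1 belonging to this market/group pair
    mask = (table1["market"] == row["market"]) & (table1["group"] == row["group"])
    if row["cost"] < 1000:
        table2.loc[index, "new_column"] = table1.loc[mask, "income"].sum() / table1.loc[mask, "clicks"].sum()
    else:
        table2.loc[index, "new_column"] = 0.0

print(table2)
```

Note the two fixes compared with the attempt above: the combined condition needs matching parentheses around each comparison, and the assignment must target a single cell with `table2.loc[index, "new_column"]`, otherwise every iteration overwrites the whole column.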


There is 1 answer below.

Answered by Code Different:

It's faster to sum up all 3 columns (cost, income and clicks) in one groupby, then fill new_column with a single vectorized condition:

import numpy as np

table2 = table1.groupby(["market", "group"])[["cost", "income", "clicks"]].sum()
table2["new_column"] = np.where(table2["cost"] > 1000, 0, table2["income"] / table2["clicks"])

You can drop the unneeded columns after.