I have two dataframes, df_a and df_b. Both have a datetime index in the format 2022-02-28 07:00:00 and are already sorted chronologically. The values in df_a['Kies'] are larger than those in df_b['Produzierte Menge'].
I would like to write a function that accumulates the values of df_b['Produzierte Menge'] until the sum is approximately as large as a value of df_a['Kies']. As soon as that value is approximately reached, the accumulation should start over for the next value of df_a['Kies'], and so on.
The following conditions need to be considered:
- the values of df_b['Produzierte Menge'] can only be accumulated in chronological order
- once a value of df_b['Produzierte Menge'] has been included in an accumulation, it cannot be used again. For example: if the first five rows were accumulated to reach the first value of df_a['Kies'], the accumulation for the second value of df_a['Kies'] should start at the sixth row
- only values from the same day are compared; there is no accumulation across multiple days
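To make the conditions concrete, here is a small sketch in plain Python with made-up numbers. The names `targets`, `produced` and `consume_in_order` are only for illustration, and "approximately reached" is assumed to mean "the running sum is at least the target".

```python
from datetime import datetime

# Made-up (timestamp, value) pairs, already sorted by time.
targets = [  # stands in for df_a['Kies']
    (datetime(2022, 2, 28, 7, 0), 10.0),
    (datetime(2022, 2, 28, 12, 0), 6.0),
]
produced = [  # stands in for df_b['Produzierte Menge']
    (datetime(2022, 2, 28, 7, 5), 4.0),
    (datetime(2022, 2, 28, 7, 30), 3.0),
    (datetime(2022, 2, 28, 8, 0), 3.5),
    (datetime(2022, 2, 28, 9, 0), 2.0),
    (datetime(2022, 2, 28, 10, 0), 4.5),
    (datetime(2022, 3, 1, 7, 0), 9.0),
]


def consume_in_order(targets, produced):
    """Return (target, reached sum, used row numbers) for every target."""
    position = 0  # next unused row of `produced`; it only moves forward
    matches = []
    for target_time, target_value in targets:
        total = 0.0
        used = []
        while position < len(produced) and total < target_value:
            prod_time, prod_value = produced[position]
            if prod_time.date() < target_time.date():
                position += 1  # row belongs to an earlier day: skip it
                continue
            if prod_time.date() > target_time.date():
                break  # row belongs to a later day: do not cross days
            total += prod_value
            used.append(position)
            position += 1
        matches.append((target_value, total, used))
    return matches


print(consume_in_order(targets, produced))
# [(10.0, 10.5, [0, 1, 2]), (6.0, 6.5, [3, 4])]
```

The first target uses rows 0 to 2, so the second target starts at row 3, and the row from 2022-03-01 is never mixed into the sums for 2022-02-28.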
THIS IS MY CODE:
import pandas as pd

def match_daily_values(df_a, df_b):
    result_rows = []
    for date, target_kies in df_a.iterrows():
        target_kies_value = target_kies['Kies']
        cumulative_produzierte_menge = 0
        for prod_date, produzierte_menge in df_b.iterrows():
            if prod_date.date() != date.date():
                continue
            cumulative_produzierte_menge += produzierte_menge['Produzierte Menge']
            result_rows.append({'Date': date, 'Cumulative_Produzierte_Menge': cumulative_produzierte_menge, 'Target_Kies': target_kies_value})
            if cumulative_produzierte_menge >= target_kies_value:
                break
    result_df = pd.DataFrame(result_rows)
    return result_df

result_df = match_daily_values(df_a, df_b)
MY PROBLEM:
Once a value of df_a['Kies'] is approximately reached, the accumulation for the next value always starts again at the first row of the respective day, instead of at the row that follows the last accumulated one.
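The restart happens because the inner loop iterates over all of df_b again for every row of df_a. One possible way around this, as a sketch rather than a verified solution, is to keep a single position into df_b that is never reset. The variable `position` and the extra output column 'Prod_Date' are my additions, and "approximately reached" is again assumed to mean that the running sum is at least the target.

```python
import pandas as pd


def match_daily_values(df_a, df_b):
    """Accumulate df_b rows per df_a target without reusing any df_b row."""
    result_rows = []
    b_days = df_b.index.normalize()  # calendar day of every df_b row
    b_values = df_b['Produzierte Menge'].to_numpy()
    position = 0  # next unused df_b row; shared by all targets, never reset

    for timestamp, target in df_a['Kies'].items():
        day = timestamp.normalize()
        cumulative = 0.0
        # Rows from earlier days can no longer be matched, so skip them.
        while position < len(df_b) and b_days[position] < day:
            position += 1
        # Consume rows of the same day until the target is reached.
        while (position < len(df_b)
               and b_days[position] == day
               and cumulative < target):
            cumulative += b_values[position]
            result_rows.append({
                'Date': timestamp,
                'Prod_Date': df_b.index[position],
                'Cumulative_Produzierte_Menge': cumulative,
                'Target_Kies': target,
            })
            position += 1
    return pd.DataFrame(result_rows)
```

This relies on both indexes being sorted, as stated above. If the rows of a day run out before a target is reached, the sketch simply moves on to the next target with whatever sum was accumulated.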