I have a sparse dataframe with the dates on which inventory is bought or sold, like the following:
Date Inventory
2017-01-01 10
2017-01-05 -5
2017-01-07 15
2017-01-09 -20
The first step I would like to solve is to add in the missing dates. I know you can use resample for this, but I am highlighting this part in case it has an impact on the next, more difficult part. The result should look like this:
Date Inventory
2017-01-01 10
2017-01-02 NaN
2017-01-03 NaN
2017-01-04 NaN
2017-01-05 -5
2017-01-06 NaN
2017-01-07 15
2017-01-08 NaN
2017-01-09 -20
The final step is to fill forward over the NaNs, except that whenever a new value is encountered it gets added to the current value of the row above, so that the final dataframe looks like the following:
Date Inventory
2017-01-01 10
2017-01-02 10
2017-01-03 10
2017-01-04 10
2017-01-05 5
2017-01-06 5
2017-01-07 20
2017-01-08 20
2017-01-09 0
2017-01-10 0
I am looking for a pythonic approach to this rather than a loop-based one, as a loop will be very slow.
The solution should also work for a table with multiple columns, such as:
Date InventoryA InventoryB
2017-01-01 10 NaN
2017-01-02 NaN NaN
2017-01-03 NaN 5
2017-01-04 NaN 5
2017-01-05 -5 NaN
2017-01-06 NaN -10
2017-01-07 15 NaN
2017-01-08 NaN NaN
2017-01-09 -20 NaN
would become:
Date InventoryA InventoryB
2017-01-01 10 0
2017-01-02 10 0
2017-01-03 10 5
2017-01-04 10 10
2017-01-05 5 10
2017-01-06 5 0
2017-01-07 20 0
2017-01-08 20 0
2017-01-09 0 0
2017-01-10 0 0
Hope that helps too. I think the current solution will have a problem with the NaNs in a case like this.
You're simply doing the two steps in the wrong order :)
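A minimal sketch of what that reordering could look like, assuming the running total is taken first with `cumsum` and the daily rows are filled in afterwards with `resample` and `ffill`. The frame below is rebuilt from the single-column example in the question, with the dates already used as a `DatetimeIndex`; it is an illustration, not necessarily the exact code originally posted here.

```python
import pandas as pd

# Sample data from the question, with the dates already set as a DatetimeIndex.
df = pd.DataFrame(
    {"Inventory": [10, -5, 15, -20]},
    index=pd.to_datetime(["2017-01-01", "2017-01-05", "2017-01-07", "2017-01-09"]),
)
df.index.name = "Date"

# Step 1: running total over the sparse rows only.
running = df.cumsum()

# Step 2: expand to daily frequency and carry the last total forward.
result = running.resample("D").ffill()
print(result)
```

Because the cumulative sum is computed before any NaN rows exist, the forward fill only has to repeat the last known total.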
Edit: you might need to set Date as the index first.
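For the multi-column case, a hedged sketch along the same lines: it assumes `Date` starts out as an ordinary column of strings, and that a NaN means "no transaction that day", so it is replaced with 0 before the cumulative sum. The sparse rows are taken from the two-column example in the question, keeping only the dates on which at least one column has a value.

```python
import numpy as np
import pandas as pd

# Sparse two-column data; Date is a plain column here, not yet the index.
df = pd.DataFrame(
    {
        "Date": ["2017-01-01", "2017-01-03", "2017-01-04", "2017-01-05",
                 "2017-01-06", "2017-01-07", "2017-01-09"],
        "InventoryA": [10, np.nan, np.nan, -5, np.nan, 15, -20],
        "InventoryB": [np.nan, 5, 5, np.nan, -10, np.nan, np.nan],
    }
)

# resample needs a DatetimeIndex, so convert and set Date as the index first.
df["Date"] = pd.to_datetime(df["Date"])

result = (
    df.set_index("Date")
      .fillna(0)        # assumption: NaN means no change on that date
      .cumsum()         # running total per column
      .resample("D")    # add the missing calendar days
      .ffill()          # repeat the last running total
)
print(result)
```

This stops at the last transaction date. The expected output in the question also shows a 2017-01-10 row; extending past the final date would need something extra, for example `reindex` onto a longer `pd.date_range` with `method="ffill"`.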