I have a dataset like this
import pandas as pd
df = pd.DataFrame(
{
"id": {0: 1, 1: 1, 2: 1, 3: 2, 4: 2},
"price": {0: 20, 1: 41, 2: 61, 3: 68, 4: 10},
"date_month_start": {
0: "2021-06-12",
1: "2021-11-13",
2: "2022-02-27",
3: "2021-04-14",
4: "2021-07-11",
},
"date_month_end": {
0: "2021-09-14",
1: "2022-01-13",
2: "2022-04-12",
3: "2021-06-18",
4: "2021-10-16",
},
}
)
print(df)
id price date_month_start date_month_end
0 1 20 2021-06-12 2021-09-14
1 1 41 2021-11-13 2022-01-13
2 1 61 2022-02-27 2022-04-12
3 2 68 2021-04-14 2021-06-18
4 2 10 2021-07-11 2021-10-16
But I would like to create a column for first of month that falls between start and end date and repeat rows (except first of month date) if there are more than one first of months falls between start and end date.
For instance if the start date is March 12, 2021 and end date is June 04, 2021, than I would like to have a new column April 1st 2021, May 1st 2021, June 1st 2021. As we have three values for the new column so, I would like to repeat rows by copying other column values except the new one.
The output data should look like:
id price date_month_start date_month_end date_month
0 1 20 2021-06-12 2021-09-14 2021-07-01
1 1 20 2021-06-12 2021-09-14 2022-08-01
2 1 20 2021-06-12 2021-09-14 2022-09-01
3 1 41 2021-11-13 2022-01-13 2021-12-01
4 1 41 2021-11-13 2022-01-13 2022-01-01
5 1 61 2022-02-27 2022-04-12 2022-03-01
6 1 61 2022-02-27 2022-04-12 2022-04-01
7 2 68 2021-04-14 2021-06-18 2021-05-01
8 2 68 2021-04-14 2021-06-18 2021-06-01
9 2 10 2021-07-11 2021-10-16 2021-08-01
10 2 10 2021-07-11 2021-10-16 2021-09-01
11 2 10 2021-07-11 2021-10-16 2021-10-01
I am new in python, anyone has any direction how to do it? I can get first day of month from date column, but it is a whole different thing.
Here is one way to do it: