I have a dataframe and these are the first 5 index, there are several rows with different datapoint for a date and then it goes to the next day
DatetimeIndex(['2014-01-01', '2014-01-01', '2014-01-01', '2014-01-01',
'2014-01-01'],
dtype='datetime64[ns]', name='DayStartedOn', freq=None)
and this is the current column dtypes
country object
type object
name object
injection float64
withdrawal float64
cy_month period[M]
I wish to add a column with calendar year month, and 2 columns with different fiscal years and months. better to separate year and month in different columns like: calendar year, calendar month, fiscal year, fiscal month. The objective is to keep these column values when I perform regroup or resample with other columns
I achieved above cy_month by
df['cy_month']=df.index.to_period('M')
even I don't feel very comfortable with this, as I want the period, not the monthend
I tried to add these 2 columns for calendar year:
pd.Period(df_storage_clean.index.year, freq='A-DEC')
for another fiscal year:
pd.Period(df_storage_clean.index.year, freq='A-SEP')
but had Traceback:
ValueError: Value must be Period, string, integer, or datetime
So I started to NOT using pandas by loop row by row and add to a list,
lst_period_cy=[]
for y in lst_cy:
period_cy=pd.Period(y, freq='A-DEC')
lst_period_cy.append(period_cy)
then convert the list to a Series or df and add it back to the df
but I suppose it's not efficient (150k rows data) so haven't continued
Just in case you haven't found a solution yet ...
You could do the following:
My assumption is that, as in your example, the index is named
DayStartedOn
. If that's not the case then the code has to be adjusted accordingly.