Resample sales data by day - now unable to use polyfit - Python


I have a dataframe of sales data like this:

sales_df

Date              Sales
01/04/2020 00:03  1
01/04/2020 02:26  4
01/05/2020 02:28  3
01/05/2020 05:09  5
01/06/2020 05:16  6
01/06/2020 05:17  7
01/07/2020 05:18  3

which looks like this on sales_df.info()

 0   Date   datetime64[ns]
 1   Sales  float64  

 

and I can perform the below and get a result

line_coef = np.polyfit(sales_df.index,sales_df['Sales'],1)

print(line_coef)

I want to do the same, but aggregated by day, so I've resampled the data like this

sales_by_day_df = sales_df.resample('D',on='Date').agg({'Sales':'sum'})

which results in a dataframe like this:

sales_by_day_df

Date        Sales
01/04/2020  5
01/05/2020  8
01/06/2020  13
01/07/2020  3

But when I try and perform the same

line_coef = np.polyfit(sales_by_day_df.index,sales_by_day_df['Sales'],1)

print(line_coef)

I get an error

UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('float64')

I notice that my dataframe now has only one column, with a DatetimeIndex. Is this the cause? Do I need to create a new column for the date when I resample the data?

sales_by_day_df.info()
DatetimeIndex: 30 entries, 2020-04-01 to 2020-04-30
Freq: D
Data columns (total 1 columns):
 #   Column           Dtype  
---  ------            -----  
 0   Sales           float64

There are 1 best solutions below


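This happens because resample() returns a dataframe indexed by the datetime values it grouped on. With on='Date', the Date column becomes the index of the result.

Your first attempt works because the index is still numeric (the default integer index), not datetime.

Your second attempt fails because the index is now a DatetimeIndex, and np.polyfit cannot do arithmetic that mixes datetime64 values with floats, as the error message says.

My advice is to reset the index and then run polyfit again.

You can reset the index with DataFrame.reset_index(), which moves Date back into a regular column and restores the default integer index.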
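For illustration, here is a minimal sketch of the reset_index() approach. The sample timestamps are an assumption (the question's rows read as 4 to 7 January 2020), and the name daily is hypothetical. The second fit, which uses days elapsed since the first date as x, is an alternative not mentioned in the answer; it may be useful if you want the slope expressed per calendar day without resetting the index.

```python
import numpy as np
import pandas as pd

# Assumed sample data mirroring the question (dates read as 4-7 Jan 2020).
sales_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-04 00:03", "2020-01-04 02:26",
        "2020-01-05 02:28", "2020-01-05 05:09",
        "2020-01-06 05:16", "2020-01-06 05:17",
        "2020-01-07 05:18",
    ]),
    "Sales": [1.0, 4.0, 3.0, 5.0, 6.0, 7.0, 3.0],
})

# Resampling on 'Date' makes Date the (datetime) index of the result.
sales_by_day_df = sales_df.resample("D", on="Date").agg({"Sales": "sum"})

# reset_index() moves Date back to a column and restores a 0..n-1 index,
# so polyfit gets plain integers for x again.
daily = sales_by_day_df.reset_index()
line_coef = np.polyfit(daily.index, daily["Sales"], 1)
print(line_coef)

# Alternative (not from the answer): keep the DatetimeIndex and use
# whole days elapsed since the first date as x.
days = (sales_by_day_df.index - sales_by_day_df.index[0]).days
line_coef_days = np.polyfit(days, sales_by_day_df["Sales"], 1)
print(line_coef_days)
```

Because the resampled rows are one day apart, both fits use x = 0, 1, 2, 3 here and give the same coefficients.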