Incorporate seasonality from the day of the year into randomforest model using sine and cosine transformations


I'm currently trying to incorporate day of the year (DOY) into my randomforest model in R using sine and cosine transformations. The reason I'm not simply using DOY is that I'd like the model to understand that December 31st and January 1st are similar, which I don't believe will be properly conveyed by values of 365 and 1. I can mimic seasonality to some extent using sine or cosine alone, but I run into the problem that multiple DOY values give the same y in sin(DOY) = y (e.g. a value of zero occurs on two dates if I use only a sine or only a cosine transformation). This can lead to a date in summer and a date in winter receiving the same sin(DOY) despite being very different. Is there a way to include a sine and cosine pair as a single feature, i.e. (sin(DOY), cos(DOY))? Or perhaps there's another way to include DOY in the model?

My current code is as follows:

dfSensor$DOYSin <- sin((dfSensor$DOY-173) * (2*pi)/365.25)

Here day 173 corresponds to June 22nd. This produces a value of +1 around September 21st and -1 around March 21st, but June 22nd and December 22nd both get a value of around 0. This issue will occur no matter what shift I use for the day. However, I think adding a cosine column to my dataframe and combining the sine and cosine transformations into one feature might resolve the issue.
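A minimal sketch of the sine/cosine pair idea, assuming the same shift (173) and period (365.25) as the line above. The `dfSensor` data here is a made-up stand-in with one row per day, and `DOYCos`, `angle`, `pair` and `gap` are names introduced only for illustration:

```r
# Hypothetical stand-in data: one row per day of a non-leap year
dfSensor <- data.frame(DOY = 1:365)

# Same phase shift and period as the DOYSin line above
angle <- (dfSensor$DOY - 173) * (2 * pi) / 365.25
dfSensor$DOYSin <- sin(angle)
dfSensor$DOYCos <- cos(angle)

# June 22 (DOY 173) and December 22 (DOY 356) both have a sine near 0,
# but their cosines have opposite signs, so the pair separates them
print(dfSensor[dfSensor$DOY %in% c(173, 356), ])

# January 1 and December 31 land close together on the unit circle
pair <- as.matrix(dfSensor[dfSensor$DOY %in% c(1, 365), c("DOYSin", "DOYCos")])
gap <- sqrt(sum((pair[1, ] - pair[2, ])^2))
print(gap)
```

A tree-based model cannot treat the pair as one two-dimensional feature, so under this approach `DOYSin` and `DOYCos` would most likely go in as two ordinary predictor columns. Each date then maps to a unique (sine, cosine) point, and the trees can split on either column.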
