I am trying to build my first time series machine learning model using recipes. I am following a guide I found online and recreating its steps with my own data for practice. I get the following error when I add step_dummy() to the recipe and then call prep():
Error in step_dummy():
Caused by error in prep():
✖ All columns selected for the step should be factor or ordered.
• 189 double variables found
Link to the guide, for reference: https://www.r-bloggers.com/2020/03/time-series-machine-learning-and-feature-engineering-in-r/
I tried converting the columns used in the interaction terms and in the dummy variable creation to factors with as.factor()/as.ordered(), but I still get the same error.
Code:
library(recipes)  # recipe(), prep(), bake(), step_*()
library(timetk)   # step_timeseries_signature()
# Base recipe: expand Date_Arrival into calendar features
recipe_timeseries <- recipe(Frequency ~ ., data = Data_Test1_ts_train_tbl) %>%
  step_timeseries_signature(Date_Arrival)
bake(prep(recipe_timeseries), new_data = Data_Test1_ts_train_tbl)
# Final recipe: drop unused features, normalize, add interactions, then dummy-encode
recipe_final <- recipe_timeseries %>%
  step_rm(Date_Arrival) %>%
  step_rm(contains("iso"),
          contains("second"), contains("minute"), contains("hour"),
          contains("am.pm"), contains("xts")) %>%
  step_normalize(contains("index.num"), Date_Arrival_year) %>%
  step_interact(~ Date_Arrival_month.lbl * Date_Arrival_day) %>%
  step_interact(~ Date_Arrival_month.lbl * Date_Arrival_mweek) %>%
  step_interact(~ Date_Arrival_month.lbl * Date_Arrival_wday.lbl * Date_Arrival_yday) %>%
  step_dummy(contains("lbl"), one_hot = TRUE)
bake(prep(recipe_final), new_data = Data_Test1_ts_train_tbl)  # the error is raised when I run this line