Is it necessary to normalize data before generating natural splines in recipes?


I fitted a model using natural splines, and I am not sure whether there is any advantage to applying a Box-Cox transformation and centering and scaling the predictors first. Does the natural spline step (step_ns) perform these transformations itself? Are there advantages to normalizing the predictors before creating the natural splines?

library(tidyverse)
library(tidymodels)

car <- read_csv('vw.csv')
str(car)
## ---- Split data -----------------------

split <- initial_split(car, prop = 0.80, strata = 'price')
car_train <- training(split)
car_test <- testing(split)


## ---- Recipe --------------------------

rec <- recipe(price ~ . , data = car_train) %>% 
  step_mutate(
    tax = log(tax + 1)
  ) %>% 
  step_ns(mpg, mileage, engineSize, year, deg_free = 3) %>% 
  step_dummy(all_nominal()) 


## -- Model ---------------------------------

model_lasso <- linear_reg(mode = 'regression', penalty = tune(), mixture = tune()) %>% 
               set_engine('glmnet')


## --- Workflow -----------------------------

work01 <- workflow() %>% 
  add_recipe(rec) %>% 
  add_model(model_lasso)

## --- Folds --------------------------------

folds <- vfold_cv(car_train, v = 10, strata = 'price')

## --- tune ---------------------------------

grid01 <- grid_latin_hypercube(parameters(model_lasso), size = 10)

tune01 <- tune_grid(
  work01,
  resamples = folds,
  grid = grid01,
  metrics = metric_set(rmse, rsq)
)

## --- Show_best ---------------------------------

show_best(tune01)
best01 <- select_best(tune01)

## --- Test --------------------------------------

test01 <- work01 %>% 
  finalize_workflow(best01) %>%
  last_fit(split)

test01 %>% collect_metrics()
