I want to remove certain variables from the plot.
# Packages
library(tidymodels)
library(mlbench)
# Data
data("PimaIndiansDiabetes")
dat <- PimaIndiansDiabetes
dat$some_new_group[1:384] <- "group 1"
dat$some_new_group[385:768] <- "group 2"
# Split
set.seed(123)
ind <- initial_split(dat)
dat_train <- training(ind)
dat_test <- testing(ind)
# Recipes
svm_rec <-
  recipe(diabetes ~ ., data = dat_train) %>%
  update_role(some_new_group, new_role = "group_var") %>%
  step_rm(pressure) %>%
  step_YeoJohnson(all_numeric_predictors())
# Model spec
svm_spec <-
  svm_rbf() %>%
  set_mode("classification") %>%
  set_engine("kernlab")
# Workflow
svm_wf <-
  workflow() %>%
  add_recipe(svm_rec) %>%
  add_model(svm_spec)
# Train
svm_trained <-
  svm_wf %>%
  fit(dat_train)
# Explainer
library(DALEXtra)
svm_exp <- explain_tidymodels(svm_trained,
                              data = dat %>% select(-diabetes),
                              y = dat$diabetes %>% as.numeric(),
                              label = "SVM")
# Variable importance
set.seed(123)
svm_vp <- model_parts(svm_exp, type = "variable_importance")
svm_vp
plot(svm_vp) +
  ggtitle("Mean-variable importance over 50 permutations", "")
Notice that in the recipe above, I removed the variable pressure and made a new categorical variable (some_new_group).
So, I can remove the variables pressure and some_new_group from the plot manually like this:
# %in% tests each row against both names; != with a length-2 vector recycles and can miss rows
plot(svm_vp %>% filter(!variable %in% c("pressure", "some_new_group"))) +
  ggtitle("Mean-variable importance over 50 permutations", "")
But, is it possible to remove the variables when I run explain_tidymodels() or model_parts()?


If you have variables that are not predictors or outcomes handled by your workflow() (like the variable you remove and your grouping variable), you want to make sure you only pass outcomes and predictors to explain_tidymodels(). You'll also need to build the explainer with the parsnip model, rather than the workflow(), which expects to handle those non-outcome, non-predictor variables.
Created on 2022-05-03 by the reprex package (v2.0.1)
If you have these "extra" variables in your workflow that shouldn't be used for explainability, then you'll need to do some extra work and can't rely on the workflow() alone.
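
As a minimal sketch of the approach described above (not necessarily the exact code from the original reprex), one way to do this is to bake the recipe down to its predictors and hand the explainer the underlying parsnip fit. The block rebuilds the objects from the question so it runs on its own; the name predictors_baked is made up for illustration, and it assumes that explain_tidymodels() accepts the parsnip model returned by extract_fit_parsnip() in your installed version of DALEXtra.

```r
# Sketch only: rebuilds the question's objects so the block runs on its own.
library(tidymodels)
library(mlbench)
library(DALEXtra)

data("PimaIndiansDiabetes")
dat <- PimaIndiansDiabetes
dat$some_new_group <- rep(c("group 1", "group 2"), each = 384)

set.seed(123)
ind <- initial_split(dat)
dat_train <- training(ind)

svm_rec <-
  recipe(diabetes ~ ., data = dat_train) %>%
  update_role(some_new_group, new_role = "group_var") %>%
  step_rm(pressure) %>%
  step_YeoJohnson(all_numeric_predictors())

svm_spec <-
  svm_rbf() %>%
  set_mode("classification") %>%
  set_engine("kernlab")

svm_trained <-
  workflow() %>%
  add_recipe(svm_rec) %>%
  add_model(svm_spec) %>%
  fit(dat_train)

# Preprocessed training data, predictors only: pressure was dropped by
# step_rm() and some_new_group has the non-predictor role "group_var".
# `predictors_baked` is a hypothetical name used for illustration.
predictors_baked <-
  svm_rec %>%
  prep() %>%
  bake(new_data = NULL, all_predictors())

# Explain the parsnip fit rather than the whole workflow (assumption:
# the installed DALEXtra version supports parsnip model_fit objects).
svm_exp <- explain_tidymodels(
  extract_fit_parsnip(svm_trained),
  data = predictors_baked,
  y = as.numeric(dat_train$diabetes),
  label = "SVM"
)

set.seed(123)
svm_vp <- model_parts(svm_exp, type = "variable_importance")
plot(svm_vp)
```

Because the explainer only ever sees the baked predictors, pressure and some_new_group should not appear in svm_vp, so no manual filtering is needed before plotting. Note that the importances then refer to the transformed (Yeo-Johnson) predictors from the training set, not the raw columns of dat.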