How do you specify variable_splits in DALEX::model_profile()?
I'm trying to get accumulated local effects for a random forest model with the above function. When doing so I get a warning:
In FUN(X[[i]], ...) : Variable: < MAP > has more than 201 unique values and all of them will be used as variable splits in calculating variable profiles. Use the variable_splits parameter to mannualy change this behaviour. If you believe this warning to be a false positive, raise issue at https://github.com/ModelOriented/ingredients/issues.
To my knowledge the DALEX package does not document how to specify this, but I think it should be a list giving the splits for each variable, like the output you would get from ingredients::calculate_variable_split. However, when I try to run that function I get a curious error, and I'm not sure what's going on there:
Error: 'calculate_variable_split' is not an exported object from 'namespace:ingredients'
Something to play with that reproduces the warning:
library(tidyverse)
library(ranger)
library(DALEX)

data("iris")
glimpse(iris)

data <- rbind(iris, iris) %>% # double it to get >201 obs
  mutate(Species = as.factor(ifelse(Species == "setosa", 1, 0))) %>% # factor species 0,1
  mutate(toomany = sample(1:1000, 300, replace = FALSE)) # get variable with >201 unique obs

# random forest probability model
mod <- ranger(Species ~ ., data, keep.inbag = TRUE, importance = "impurity",
              seed = 4, probability = TRUE)

# explainer
ex <- DALEX::explain(model = mod,
                     data = data[, -5],
                     y = as.numeric(as.character(data$Species)),
                     label = "Random Forest")

# works but see warning about variable_splits
ale <- model_profile(explainer = ex, type = "accumulated") # get accumulated local effects

# doesn't work
# needs a list I think...but not sure exactly what it's looking for
ale <- model_profile(explainer = ex, type = "accumulated", variable_splits = 20)
It is looking for a list of split points for each covariate, i.e. a named list with one numeric vector per variable. So something like this would work.
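As a minimal sketch of such a list (not necessarily the exact code the answer had in mind): the block below rebuilds the question's predictors with base R and computes up to 20 quantile-based split points per column. The helper `make_splits` and the choice of 20 quantiles are my own illustrative assumptions, not DALEX defaults, and `ex` is assumed to be the explainer created in the question.

```r
# Rebuild the question's predictors with base R only (Species column dropped).
set.seed(4)
predictors <- rbind(iris, iris)[, -5]
predictors$toomany <- sample(1:1000, 300, replace = FALSE)

# Hypothetical helper: up to n evenly spaced quantiles of x, deduplicated.
make_splits <- function(x, n = 20) {
  unique(unname(quantile(x, probs = seq(0, 1, length.out = n), na.rm = TRUE)))
}

# Named list: one numeric vector of split points per variable.
splits <- lapply(predictors, make_splits)
str(splits)

# Pass the list through variable_splits; this only runs if the question's
# explainer `ex` already exists in the session.
if (exists("ex")) {
  ale <- DALEX::model_profile(explainer = ex, type = "accumulated",
                              variable_splits = splits)
}
```

The names of the list have to match the column names of the data stored in the explainer. Hand-picked grids such as `seq(4, 8, by = 0.1)` for a single variable should work the same way if quantiles are not what you want.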