I want to use the bs function for the numerical variables in my dataset when fitting a logistic regression model.
df <- data.frame(a = c(0, 1), b = c(0, 1), d = c(0, 1), e = c(0, 1),
                 f = c("m", "f"), output = c(0, 1))
library(splines)
model <- glm(output ~ bs(a, df = 2) + bs(b, df = 2) + bs(d, df = 2) + bs(e, df = 2) +
               factor(f),
             data = df,
             family = "binomial")
In my actual dataset, I need to apply bs() to way more columns than this example. Is there a way I can do this without writing all the terms?
We can use some string manipulation with sprintf, together with reformulate, to build the formula from a vector of variable names.

If you want to use a different df and degree for each spline, it is also straightforward (note that df cannot be smaller than degree).

Yes. This is good for safety. And of course, there is an automatic way if the OP wants to include all numerical variables other than "output" as predictors.
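A minimal sketch of the three ideas above, not necessarily the exact code the answer had in mind. It assumes simulated data with the same column names as the question, since the two-row example is too small to fit. The names num_vars, spline_df, spline_degree and auto_vars are hypothetical helpers introduced for illustration. It uses df = 3 rather than the question's df = 2, because bs() with its default cubic degree warns and falls back to 3 when df is smaller.

```r
library(splines)

# Simulated stand-in for the real dataset (assumption: same column names
# as in the question, 200 rows so the model can actually be fitted).
set.seed(1)
n <- 200
df <- data.frame(a = rnorm(n), b = rnorm(n), d = rnorm(n), e = rnorm(n),
                 f = sample(c("m", "f"), n, replace = TRUE))
df$output <- rbinom(n, 1, plogis(df$a - df$b))

# 1. Same spline settings for every numeric predictor:
#    sprintf() builds one "bs(...)" term per name, reformulate() joins them.
num_vars <- c("a", "b", "d", "e")
spline_terms <- sprintf("bs(%s, df = 3)", num_vars)
form1 <- reformulate(c(spline_terms, "factor(f)"), response = "output")
model1 <- glm(form1, data = df, family = "binomial")

# 2. A different df and degree for each spline; sprintf() is vectorised
#    over all three arguments. The check guards against df < degree.
spline_df     <- c(3L, 4L, 3L, 5L)
spline_degree <- c(3L, 3L, 2L, 3L)
stopifnot(all(spline_df >= spline_degree))
spline_terms2 <- sprintf("bs(%s, df = %d, degree = %d)",
                         num_vars, spline_df, spline_degree)
form2 <- reformulate(c(spline_terms2, "factor(f)"), response = "output")
model2 <- glm(form2, data = df, family = "binomial")

# 3. Pick up every numeric column except the response automatically.
is_num <- vapply(df, is.numeric, logical(1))
auto_vars <- setdiff(names(df)[is_num], "output")
form3 <- reformulate(c(sprintf("bs(%s, df = 3)", auto_vars), "factor(f)"),
                     response = "output")
```

Printing form1 shows the full formula, which is a quick way to confirm the terms are what you expect before fitting. With many columns, the automatic selection in step 3 avoids typing the names at all, but it also sweeps in any numeric column that should not be a predictor (an ID column, for example), so it is worth inspecting auto_vars first.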