I'm really scratching my head here and don't understand what is going on. This is an MWE; the actual code and purpose are more complex than this. So the code:
library(dplyr)
ds <- mutate(iris, Species = as.numeric(Species == 'setosa'))
ds %>%
  do_(
    .dots = lazyeval::interp(
      "broom::tidy(stats::glm(form, data = ., family = distr))",
      form = Species ~ Sepal.Length,
      distr = binomial()
    )
  )
Which returns: Error in family$linkfun(mustart) : object 'C_logit_link' not found
... but this code bit works fine:
ds %>%
  do_(
    .dots = lazyeval::interp(
      "broom::tidy(stats::glm(form, data = ., family = distr))",
      form = Sepal.Width ~ Sepal.Length,
      distr = gaussian()
    )
  )
The only difference between the two is the family distribution used (gaussian vs binomial) and the variable used.
So the question: why is it that lazyeval can't find C_logit_link?
When you call interp(x, *), it evaluates the arguments that are to be interpolated into x. In the case of binomial(), the result is a structure that represents the binomial distribution in a GLM.

Buried inside that structure is a function that calls out to compiled C code, via the object C_logit_link. This is an unexported object in the stats package. Normally everything works fine, because the environment of that function is the stats namespace and so it's able to find C_logit_link.

The problem here is that the object you're interpolating is a string, which means that everything interpolated into it is also coerced into a string. That loses the environment information necessary to find C_logit_link.

The solution is to interp a formula instead:
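A minimal sketch of the formula-based call, reusing ds from the question. It assumes installed versions of dplyr, lazyeval and broom that still provide do_() and interp() (both are deprecated in newer releases, so expect warnings); the name res is only there for illustration.

```r
library(dplyr)

# Same data as in the question: 1 for setosa, 0 otherwise
ds <- mutate(iris, Species = as.numeric(Species == "setosa"))

res <- ds %>%
  do_(
    .dots = lazyeval::interp(
      # A formula (note the leading ~) instead of a string, so the
      # interpolated objects are not turned into text
      ~ broom::tidy(stats::glm(form, data = ., family = distr)),
      form = Species ~ Sepal.Length,
      distr = binomial()
    )
  )

res
```

Because the binomial() object is never turned into text here, its link function should keep its original environment and still be able to see C_logit_link.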
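To see the environment loss in isolation, here is a base-R sketch that does not use lazyeval at all. The deparse()/parse() round trip is only a stand-in for what happens when an object is coerced into a string and evaluated again, and the variable names are made up for this example.

```r
# The link function stored inside the binomial family object
fam <- binomial()
linkfun <- fam$linkfun

# Its enclosing environment can see the unexported C_logit_link
in_namespace <- exists("C_logit_link", envir = environment(linkfun))

# Turn the function into text and back, evaluating in the global
# environment: the code survives, the original environment does not
txt <- paste(deparse(linkfun), collapse = "\n")
rebuilt <- eval(parse(text = txt), envir = globalenv())

original_value <- linkfun(0.5)                      # works
rebuilt_result <- try(rebuilt(0.5), silent = TRUE)  # fails to find C_logit_link
failed <- inherits(rebuilt_result, "try-error")

# The gaussian (identity) link is plain R code with no C symbol,
# so the same round trip is harmless there
id_txt <- paste(deparse(gaussian()$linkfun), collapse = "\n")
id_rebuilt <- eval(parse(text = id_txt), envir = globalenv())
id_value <- id_rebuilt(0.5)
```

This would also explain why the gaussian() version in the question runs without error: its link function does not appear to depend on anything private to the stats namespace.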