I am trying to find a way to replicate a plot similar to this, where the splines and the basis functions that make up those splines are both plotted in the same window.
I have successfully done both separately below:
#### Load Libraries ####
library(mgcv)
library(tidyverse)
library(gratia)
library(gamair)
library(ggpubr)
#### Set Theme ####
theme_set(theme_bw())
#### Add Data ####
data("wesdr")
wes <- as_tibble(wesdr)
wes
#### Fit GAM ####
fit <- gam(
ret ~ s(dur, bs = "cr"),
method = "REML",
family = binomial,
data = wes
)
#### Plot Basis Functions ####
b <- draw(basis(fit))
s <- draw(fit)
ggarrange(b,s)
However, I'm not sure how to mash them together. Simply throwing them on top of each other obviously doesn't work:
#### Attempt at Plotting BF and Spline ####
wes %>%
ggplot(aes(x=dur,
y=ret))+
stat_smooth(method = "gam",
method.args = list(family = binomial),
formula = y ~ s(x, bs = "cr"),
se = T,
color = "steelblue")+
geom_line(data = basis(fit),
aes(x=dur,
y=value,
color=bf))
How can one achieve this?
The figure isn't really showing or using any response data, only values of the spline covariate and it doesn't really need that unless you want pretty, smooth, basis functions. It's a different matter if you want to draw the basis for an estimated spline. Assuming you might want both (the first for teaching or explaining how splines work, the second to explain it in terms of a specific fit), below I show how to generate both kinds of figure.
option 1, using a basis and user-specified weights
This produces:
option 2, using an estimated model
If you want to do this for an actual model fit, you could follow the above example, but you would need to include the identifiability constraints in the spline (see
?basis
) and extract the correct weights for the basis functions from the vector of model coefficients returned bycoef(m)
.{gratia}'s
basis()
has a method for fitted models, which automates this process.This produces
To get the final version you wanted (with a credible interval), evaluate the spline at the same covariate values using
smooth_estimates()
instead of manually summing the basis functions:which produces
Where were you going wrong?
I think your approach didn't work for a couple of reasons.
draw()
methods don't return the underlying data. By design (because of howggplot()
works) they return ggplot objects. It is better to use functions (like you did withbasis()
) to get the outputs you want, and then plot them yourself usingggplot()
, as I showed with the last example in Option 2.Don't ever use
geom_smooth()
orstat_smooth()
to fit a GAM. It's easy to make mistakes; here you forget to ask formethod = "REML"
, which you need to do throughmethod_args = list(method = "REML")
in thestat_smooth()
call.You approach isn't too wrong though; notice that many of the basis functions on the left of the figure are negative so they pull the fitted spline down, even though some of the other basis functions peak above the fitted spline.
One final comment; use the {patchwork} package to arrange objects returns by
draw()
as you'll get better alignment.draw.gam()
and many otherdraw()
methods in {gratia} already return patchworks, not simple ggplot objects, so you'll get the best compatibility if you use {patchwork}'s layout tools.