Consider the code below, which fits a generalized additive model with two terms: x0, which is linear, and x1, which is nonlinear:

library(mgcv)
set.seed(2) ## simulate some data...
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ x0 + s(x1, k = 5), data = dat, method = "REML")

The model b estimates 3 parameters: an intercept, one parametric coefficient for x0, and one smoothing parameter for x1. How can I extract the estimated covariance matrix of these 3 parameters? I have used vcov(b), which gives the following results:

             (Intercept)           x0      s(x1).1      s(x1).2      s(x1).3      s(x1).4
(Intercept)  0.104672470 -0.155791753  0.002356237  0.001136459  0.001611635  0.001522158
x0          -0.155791753  0.322528093 -0.004878003 -0.002352757 -0.003336490 -0.003151250
s(x1).1      0.002356237 -0.004878003  0.178914602  0.047701707  0.078393786  0.165195739
s(x1).2      0.001136459 -0.002352757  0.047701707  0.479869768  0.606310668  0.010704075
s(x1).3      0.001611635 -0.003336490  0.078393786  0.606310668  0.933905535  0.025816649
s(x1).4      0.001522158 -0.003151250  0.165195739  0.010704075  0.025816649  0.184471259

It seems vcov(b) gives the covariances related to each basis function (knot) of the smooth term s(x1), as the results contain s(x1).1 through s(x1).4 (that's my guess). I need the covariance between the estimated smoothing parameter and the parametric coefficients, which should be just one value for (Intercept) and one for x0. Is it available at all?

Edit: I have set the estimation method to REML in the code. I agree that I may have used incorrect phrases to explain my idea, as Gavin Simpson says, and I understand all he said. Yet the idea of calculating the covariance between the parametric coefficients (the intercept and the coefficient of x0) and the smoothing parameter comes from the method of estimation: if we set it to ML or REML, then such a covariance could exist, I guess. In that case, the estimated covariance matrix of the log smoothing parameter estimates is provided by sp.vcov. So I think a similar value could exist for the parametric coefficients and the smoothing parameter.
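For reference, a minimal sketch of pulling out that log smoothing parameter covariance matrix with sp.vcov, assuming the model is fitted with method = "REML" as intended:

```r
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ x0 + s(x1, k = 5), data = dat, method = "REML")

## covariance matrix of the log smoothing parameter estimates
## (when the scale parameter is estimated, it is included too)
V <- sp.vcov(b)
V
```

sp.vcov only returns a matrix for ML/REML fits using the (default) outer Newton optimizer; otherwise it returns NULL.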

Your statement

The model b estimates 3 parameters: an intercept, one parametric coefficient for x0, and one smoothing parameter for x1.

is incorrect.

The model estimates many more coefficients than these three. Note also that it is confusing to speak of a smoothing parameter for x1: the model does estimate one of those, but I doubt that is what you mean by the phrase. The smoothing parameter estimated for x1 is the value that controls the wiggliness of the fitted spline. It is estimated alongside the other coefficients, but it isn't typically considered part of the main model parameters, because what you see in the VCOV are the variances and covariances of the model coefficients conditional upon this value of the smoothing parameter.
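This is easy to check directly; a quick sketch, assuming the question's model refitted with method = "REML":

```r
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ x0 + s(x1, k = 5), data = dat, method = "REML")

length(coef(b))  ## 6 coefficients: intercept, x0, and 4 basis coefficients
b$sp             ## the single smoothing parameter, held separately
dim(vcov(b))     ## 6 x 6: the VCOV covers the 6 coefficients only
```

The 6 x 6 dimension matches the matrix printed in the question: the smoothing parameter has no row or column of its own.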

The GAM fitted here is one in which the effect of x1 is represented by a spline basis expansion of x1. For the basis used and the identifiability constraints applied to it, this means that the true effect of x1, f(x1), is estimated via k - 1 basis functions. The estimate is the function hat(f)(x1) = \sum_i \beta_i b_i(x1), obtained by summing the basis functions b_i evaluated at the observed values of x1, each weighted by its model coefficient \beta_i.
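The basis-expansion view can be made concrete with the linear predictor matrix; a sketch, assuming the same REML fit as in the question:

```r
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ x0 + s(x1, k = 5), data = dat, method = "REML")

## each column of Xp is a (parametric term or) basis function evaluated
## at the data; the fitted values are just the weighted sum Xp %*% beta
Xp  <- predict(b, type = "lpmatrix")
fit <- drop(Xp %*% coef(b))
all.equal(unname(fit), unname(fitted(b)))  ## TRUE
```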

Hence, once the basis is chosen and once we have a smoothing parameter (my version of the term, the one controlling the wiggliness), this model is simply a GLM in x0 and the 4 basis functions evaluated at x1. Hence it is fully parametric, and there isn't a single element in the VCOV that relates to the smooth f(x1); the model just doesn't work that way.
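If the underlying concern is that smoothing parameter uncertainty is ignored by the conditional VCOV, note that vcov.gam can apply a correction for it via its unconditional argument (a sketch; this requires an ML/REML fit so that the corrected matrix is available):

```r
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ x0 + s(x1, k = 5), data = dat, method = "REML")

Vp <- vcov(b)                        ## conditional on the estimated smoothing parameter
Vu <- vcov(b, unconditional = TRUE)  ## corrected for smoothing parameter uncertainty
## the correction typically inflates the coefficient variances
diag(Vu) - diag(Vp)
```

This still returns a covariance matrix for the model coefficients only, but one in which the uncertainty in the estimated smoothing parameter has been propagated.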