R glm(): confusing confidence intervals for binomial regression

1.5k Views Asked by At

I have been using glm() in R to compute confidence intervals for the logit probability parameter governing a single binomial draw.

P <- 20 # Number of successes
D <- 1  # Number of failures
model1 <- glm(matrix(c(P,D), nrow=1) ~ 1, family="binomial") # Successes modeled as binomial draw from successes+failures
summary(model1)
confint(model1)  # Confidence interval on binomial parameter p

I notice that if either the number of successes or failures is zero (P=0 or D=0), the confidence intervals returned are meaningless.

I then computed my own confidence intervals by numerically integrating the normalized binomial likelihood:

binomial_fun <- function(p) {choose(N, K)*(p^K)*(1-p)^(N-K)}  # A binomial function
logit_fun <- function(p) {log(p/(1-p))}                       # A logit function
f_upper <- function(a) {integrate(binomial_fun, 0, a )$value/integrate(binomial_fun, 0, 1 )$value - .975}
f_lower <- function(a) {integrate(binomial_fun, a, 1 )$value/integrate(binomial_fun, 0, 1 )$value - .975}
# These functions will take value zero when a corresponds to the 97.5% and 2.5% points
# of the normalized binomial likelihood given N and K.

N <- P+D     # N is the number of trials
K <- P       # K is the number of successes
uci2 <- logit_fun(uniroot( f_upper, c(0, 1) )$root) # Look for a solution in [0,1]
lci2 <- logit_fun(uniroot( f_lower, c(0, 1) )$root) # Look for a solution in [0,1]

This gave meaningful confidence intervals where P=0 or D=0. However, it gives different confidence intervals from glm() + confint() even when neither P nor D is zero.

confint(model1)
c(lci2, uci2)

Compared to glm() + confint(), both the lci and uci that I compute tend to be closer to zero.

How is confint() computing the interval? I'm sure there is a good reason for doing it this way in terms of performance on more complicated glms, but it seems like a weird result in this simple case.

0

There are 0 best solutions below