How can I force the lm() and glm() functions not to rescale weights in linear regression?


I just want to run lm() and glm() for linear regression without the weights being rescaled, i.e., with the weights used exactly as I specify them. How can I do that?

It is known, though not documented, that these functions rescale the weights as they see fit: https://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer So if you specify some weights w and then, say, w*100, you get the same p-values and confidence intervals. The problem occurs for linear regression only; for logistic regression the weights are treated correctly.
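The behaviour is easy to reproduce. A minimal sketch with simulated data (the data and weights here are made up purely for illustration):

```r
set.seed(1)
d <- data.frame(x = rnorm(20), w = runif(20, 1, 5))
d$y <- 2 * d$x + rnorm(20)

fit_w    <- lm(y ~ x, data = d, weights = w)
fit_100w <- lm(y ~ x, data = d, weights = 100 * w)

# Coefficients, standard errors, t-values and p-values are identical:
identical_tables <- all.equal(coef(summary(fit_w)), coef(summary(fit_100w)))
```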

Answer by Thomas Lumley:

First, a correction: lm and glm do not rescale the weights (you can check the code), it's just that the overall mean of the weights doesn't affect the answers for linear regression (as it shouldn't). The mean of the weights gets absorbed into the scale parameter -- you can see that the estimated residual standard error changes.
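This can be checked directly: multiplying the weights by 100 leaves the coefficient table unchanged but multiplies the estimated residual standard error by 10. A sketch with simulated data:

```r
set.seed(1)
d <- data.frame(x = rnorm(20), w = runif(20, 1, 5))
d$y <- 2 * d$x + rnorm(20)

fit_w    <- lm(y ~ x, data = d, weights = w)
fit_100w <- lm(y ~ x, data = d, weights = 100 * w)

# The weight scale is absorbed into the scale parameter:
# sigma_hat^2 = sum(w * e^2) / (n - p), so scaling w by 100 scales sigma by 10
ratio <- sigma(fit_100w) / sigma(fit_w)
```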

If you want to get 10 times smaller variances as a result of multiplying all the weights by 10, or similar, one way to do this is to use lm or glm and then divide the estimated variances by the mean of the weights (equivalently, divide the standard errors by its square root). Or, if you are using glm, you can specify a fixed dispersion to the summary function instead of having it estimated. That is the best approach in the problem you linked to, where the issue is that the scale parameter is known on an absolute scale and doesn't have to be estimated.
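For the glm route, summary.glm() takes a dispersion argument that fixes the scale parameter rather than estimating it. A sketch, assuming the known scale is 1 and using simulated data:

```r
set.seed(1)
d <- data.frame(x = rnorm(20), w = runif(20, 1, 5))
d$y <- 2 * d$x + rnorm(20)

g_w   <- glm(y ~ x, data = d, family = gaussian, weights = w)
g_10w <- glm(y ~ x, data = d, family = gaussian, weights = 10 * w)

# With a fixed dispersion, Var(beta_hat) = dispersion * (X'WX)^{-1},
# so scaling the weights by 10 scales the variances by 1/10
s_w   <- summary(g_w,   dispersion = 1)
s_10w <- summary(g_10w, dispersion = 1)
se_ratio <- coef(s_w)[, "Std. Error"] / coef(s_10w)[, "Std. Error"]
```

Here se_ratio comes out at sqrt(10), i.e. the absolute scale of the weights now feeds through to the standard errors.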

Alternatively, if your weights are integers you can do

expanded_data <- data[rep(1:nrow(data), data$weights), ]

to repeat each observation as many times as its weight.
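A sketch of that expansion trick with made-up data: the expanded fit reproduces the weighted point estimates, while its inference now treats each copy as a real observation.

```r
data <- data.frame(x = c(1, 2, 3, 4),
                   y = c(1.1, 2.3, 2.8, 4.2),
                   weights = c(1, 2, 3, 1))

# Repeat each row as many times as its (integer) weight
expanded_data <- data[rep(1:nrow(data), data$weights), ]

fit_weighted <- lm(y ~ x, data = data, weights = weights)
fit_expanded <- lm(y ~ x, data = expanded_data)

# Same coefficients; the expanded fit's standard errors use n = sum(weights)
same_coefs <- all.equal(coef(fit_weighted), coef(fit_expanded))
```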

There aren't many statistical applications for this in linear regression, so it's not built in.

If you have a generalised linear model with a fixed scale parameter (such as Poisson or binomial), the overall scale of the weights does affect the standard errors (as it should); but if you start estimating the scale parameter (e.g. with quasibinomial or quasipoisson), it doesn't.
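A quick check of both cases, using simulated Poisson data with arbitrary weights:

```r
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- rpois(50, exp(0.5 * d$x))
d$w <- runif(50, 1, 3)

# Fixed scale (poisson): scaling the weights by 10 shrinks the
# standard errors by sqrt(10)
p_w   <- glm(y ~ x, data = d, family = poisson, weights = w)
p_10w <- glm(y ~ x, data = d, family = poisson, weights = 10 * w)
fixed_ratio <- coef(summary(p_w))[, "Std. Error"] /
               coef(summary(p_10w))[, "Std. Error"]

# Estimated scale (quasipoisson): the dispersion estimate absorbs the
# factor of 10, so the standard errors don't change
q_w   <- glm(y ~ x, data = d, family = quasipoisson, weights = w)
q_10w <- glm(y ~ x, data = d, family = quasipoisson, weights = 10 * w)
quasi_ratio <- coef(summary(q_w))[, "Std. Error"] /
              coef(summary(q_10w))[, "Std. Error"]
```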