I have a sample of more than 50 million observations. I estimate the following model in R:
model1 <- feglm(rejection ~ variable1 + variable1^2 + variable2 + variable3 + variable4 | city_fixed_effects + year_fixed_effects, family = binomial(link = "logit"), data = database)
Based on the estimates from model1, I calculate the marginal effects:
mfx2 <- marginaleffects(model1)
summary(mfx2)
This line of code also calculates the marginal effects of each fixed effect, which slows down R. I only need the average marginal effects of variables 1, 2, and 3. If I instead calculate the marginal effects separately, using mfx2 <- marginaleffects(model1, variables = "variable1"), the output does not show the standard error or the p-value of the average marginal effect.
Any solution for this issue?
Both the fixest and the marginaleffects packages have made recent changes to improve interoperability. The next official CRAN releases will be able to do this, but as of 2021-12-08 you can install the development versions.

I recommend converting your fixed effects variables to factors before fitting your models.
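A minimal sketch of these two steps, reusing the variable names from the question. It assumes the development versions are installed from the packages' GitHub repositories via the remotes package, and that database is the data frame from the question; adapt both to your setup.

```r
# Assumption: development versions come from GitHub via the remotes package.
# Uncomment to install.
# install.packages("remotes")
# remotes::install_github("lrberge/fixest")
# remotes::install_github("vincentarelbundock/marginaleffects")

library(fixest)
library(marginaleffects)

# Convert the fixed effects variables to factors before fitting.
# `database` is the data frame from the question.
database$city_fixed_effects <- as.factor(database$city_fixed_effects)
database$year_fixed_effects <- as.factor(database$year_fixed_effects)

# Refit the model from the question on the converted data.
model1 <- feglm(
  rejection ~ variable1 + variable1^2 + variable2 + variable3 + variable4 |
    city_fixed_effects + year_fixed_effects,
  family = binomial(link = "logit"),
  data = database
)
```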
Then, you can use marginaleffects and summary to compute average marginal effects.

Note that computing average marginal effects requires calculating a distinct marginal effect for every single row of your dataset. This can be computationally expensive when your data includes millions of observations.
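The average marginal effects computation described above might look like the sketch below. It assumes model1 is the feglm fit from the question and restricts the calculation to the three regressors of interest with the variables argument. The function names are those of the late-2021 releases; later releases of marginaleffects renamed marginaleffects() to slopes() and added avg_slopes(), so adjust to your installed version.

```r
library(marginaleffects)

# Assumes `model1` is the feglm fit from the question.
# Restricting `variables` avoids computing effects for the other regressors.
mfx2 <- marginaleffects(
  model1,
  variables = c("variable1", "variable2", "variable3")
)

# summary() averages the row-level marginal effects and reports
# standard errors and p-values for each variable.
summary(mfx2)
```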
Instead, you can compute marginal effects for specific values of the regressors using the newdata argument and the typical function. Please refer to the marginaleffects documentation for details on those.
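As a rough sketch of this cheaper alternative, the call below evaluates marginal effects on a small grid instead of on every row. The values 0 and 1 for variable1 are hypothetical placeholders; regressors not named in typical() are held at typical values (such as the mean or mode). It assumes model1 is the fit from the question. In later releases of marginaleffects, typical() was replaced by datagrid().

```r
library(marginaleffects)

# Assumes `model1` is the feglm fit from the question.
# The grid values for variable1 (0 and 1) are hypothetical; pick values
# that are meaningful for your data.
mfx3 <- marginaleffects(
  model1,
  variables = c("variable1", "variable2", "variable3"),
  newdata = typical(variable1 = c(0, 1))
)

# One set of marginal effects per grid row, not one per observation.
mfx3
```

Because the grid has only a couple of rows, the cost no longer scales with the 50 million observations in the sample.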