All levels of variable predicted when reference level has no observations in fixed effect regression


I am using the marginaleffects package to calculate conditional marginal effects for a fixed-effects regression fitted with the fixest package.

My code looks like the following:

library(marginaleffects)
library(tidyverse)
library(fixest)


reg <- feols(DV ~ var1:(var3 + control) + var3 + var1:var2 + var2 + control | year,
             data = data, vcov = "cluster")


plot_cme(reg, variables = c("var3"), condition = list("var1"), draw = FALSE) %>%
  # Label each contrast by the non-reference level it compares against type1
  mutate(type = case_when(contrast == "type2 - type1" ~ "type2",
                          contrast == "type3 - type1" ~ "type3",
                          contrast == "type4 - type1" ~ "type4",
                          contrast == "type5 - type1" ~ "type5")) %>%
  ggplot(aes(x = var1, y = estimate, ymin = conf.low, ymax = conf.high)) +
  facet_wrap(~type) +
  geom_point() +
  geom_errorbar() +
  theme_light()

The variable var1 is intentionally interacted with var3 and var2 without its own uninteracted (main) effect, because var1 depends on var3 and including the main effect would cause perfect collinearity.

All this code works, but I know that this version of the dataset (data) contains no observations at the reference level of the categorical variable var3 (ref = type1). Every observation of var3 is "type2", "type3", "type4" or "type5". Yet I can plot CMEs and get summary() estimates for all four other types of var3, both interacted and non-interacted. I do not understand how this is possible. How can these effects be calculated when there is no reference level to compare them against?
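To make the situation concrete, here is a minimal sketch of how an empty reference level can be checked. It uses only base R on a made-up toy data frame (the name `toy` and its contents are hypothetical, not the real data), and it assumes var3 is stored as a factor that still declares "type1" as its first level even though no row uses it.

```r
# Toy stand-in for the real data (assumption): var3 is a factor whose
# declared reference level "type1" never occurs in any row.
set.seed(1)
toy <- data.frame(
  var3 = factor(
    sample(c("type2", "type3", "type4", "type5"), 200, replace = TRUE),
    levels = c("type1", "type2", "type3", "type4", "type5")
  )
)

levels(toy$var3)              # declared levels, still including "type1"
table(toy$var3)               # observed counts per level; "type1" has none
levels(droplevels(toy$var3))  # only the levels that actually occur
```

Running the same three calls on data$var3 would show whether "type1" survives as a declared but empty level, which is the case described above.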

Is this caused by the use of fixed effects for year, since that eliminates the intercept? If so, how do I interpret the results from plot_cme()? What are the estimates measured relative to, if the reference category is no longer present?

If I look at the plot_cme() output with draw = FALSE, like so:

plot_cme(reg,variables = c("var3"), condition = list("var1"), draw = FALSE)

The column "var3" is listed as "type2" in every row. This would seem to indicate that the reference level has been changed to "type2". Yet I still get a contrast for "type2 - type1". How is this possible? And how can I interpret the marginal effects, assuming this model is correct?
