ml <- foreign::read.dta("https://stats.idre.ucla.edu/stat/data/hsbdemo.dta")
model.fit <- multinom(ses~female*prog+read, data=ml)
I have the following coefficients:
Call:
multinom(formula = ses ~ female * prog + read, data = ml)
Coefficients:
(Intercept) femalefemale progacademic progvocation read femalefemale:progacademic
middle -1.919435 -0.01626812 1.271990 1.301494 0.04404166 -1.334239
high -4.701536 0.44126239 2.092619 1.171212 0.07779108 -1.815047
femalefemale:progvocation
middle -0.6859539
high -1.5431251
Residual Deviance: 386.2441
AIC: 414.2441
with odds ratio exp(1.271990)=3.567946 and exp(2.092619)=8.106117
However, with emmeans package,
EMM <- emmeans(model.fit, revpairwise~ses|female*prog)
REMM <- regrid(EMM, transform="logit")
contrast(REMM, interaction="revpairwise", type="response", by="female")
I can only get the following output
female = male:
ses_revpairwise prog_revpairwise odds.ratio SE df null t.ratio p.value
middle / low academic / general 4.926 5.380 14 1 1.460 0.1663
high / low academic / general 15.314 16.723 14 1 2.499 0.0255
high / middle academic / general 3.109 3.174 14 1 1.111 0.2853
middle / low vocation / general 7.385 9.044 14 1 1.633 0.1248
high / low vocation / general 4.430 5.496 14 1 1.200 0.2501
high / middle vocation / general 0.600 0.780 14 1 -0.393 0.7002
middle / low vocation / academic 1.499 1.699 14 1 0.357 0.7262
high / low vocation / academic 0.289 0.326 14 1 -1.099 0.2903
high / middle vocation / academic 0.193 0.222 14 1 -1.431 0.1745
female = female:
ses_revpairwise prog_revpairwise odds.ratio SE df null t.ratio p.value
middle / low academic / general 0.885 0.826 14 1 -0.131 0.8977
high / low academic / general 1.475 1.429 14 1 0.401 0.6943
high / middle academic / general 1.667 1.645 14 1 0.517 0.6130
middle / low vocation / general 3.075 3.245 14 1 1.065 0.3051
high / low vocation / general 0.656 0.750 14 1 -0.369 0.7177
high / middle vocation / general 0.213 0.262 14 1 -1.259 0.2287
middle / low vocation / academic 3.475 3.327 14 1 1.301 0.2144
high / low vocation / academic 0.445 0.446 14 1 -0.808 0.4324
high / middle vocation / academic 0.128 0.135 14 1 -1.947 0.0719
Tests are performed on the log odds ratio scale
The ORs are different: 4.926 vs 3.567946 & 15.314 vs 8.106117.
What's going on here?
same output as exp of multinom coefficients
The coefficients of multinomial models are NOT odds ratios. They are on the log scale, not the logit scale.
I will demonstrate using manual calculations with model predictions. First, create a grid with the factor combinations. We also set
readequal to its mean, which is 52.23.Now, let's obtain the predicted probabilities on this grid:
From these, we convert these probabilities to odds:
Finally, we obtain ratios of these odds, and combine with
Gto show the associated factor levels:None of the above calculations used the emmeans package; just model predictions. So now we verify that we obtain the same results using
emmeans(). First, create theemmGridobject for the same factor combinations:Then, obtain Dunnett contrasts of these estimates after re-gridding to the logit scale:
Created on 2023-12-30 with reprex v2.0.2
These results are the same as the
ORmatrix obtained earlier, verifying thatemmeans()is doing things correctly. Note that none of these odds ratios are equal to 3.567946 or 8.106117. That's because the regression coefficients are not logs of odds ratios.I'll demonstrate just one of the cases to show how it works. Consider the first case, where
female = "male",program = "general", andread = 52.53. Here, only the intercept and coefficient ofreadare relevant, as the dummy variables for everything else are zero. We have two rows of coefficients, formiddleandhigh; those forloware all considered zero. We thus have the following linear predictor for the multinomial response:As mentioned, the linear predictions are on the log scale. To back-transform them, we exponentiate them, then divide by the sum so that they sum to 1:
Note that these are the same (with slight baubles from rounding) as the first row of
probthat we obtained frompredict(). Considering how this works, if there were only two levels of the multinomial response, then the linear predictor will be of the formc(0, b), and the probability for the second level will beexp(b) / (1 + exp(b))so thatbis indeed a logit. But that's only true for two multinomial classes.Also, a comment that it is unclear why the OP chose the coefficients of
progacademicand not, say, the intercepts. Nor is it clear why interaction contrasts were selected.