How do I use interaction variables in a multinomial probit model?

I'm doing an econometrics project in Stata, looking at the impact of an individual's BMI (body mass index) and height on their earnings. The height variable is called height5 because I divided height by 5, so that I can interpret the result as "being 5 inches taller increases your probability of being in earnings category Y by X percentage points". The data set doesn't have a continuous earnings variable, only a categorical one; for example, category 3 is >$30000 and <$40000. I have found that height increases the likelihood of someone being in the highest income category. However, I want to know whether this relationship is stronger for men by using an interaction variable, and I'm not sure how to add the interaction to the model.

Currently my oprobit looks like this:

oprobit earncat height5 BMI age i.sex i.mrd i.cworker i.race

I'm unsure what to add to the end to create an interaction, and furthermore, how to interpret it.

Ideally I would be able to say something like "being taller is more advantageous for a man than for a woman in terms of the likelihood of landing in a higher earnings category" or something of the sort.

Once I can do this, I can add more interactions, for example to look at whether the relationship between height and earnings varies across occupations.

There is 1 best solution below

It's been a while since I studied ordered and multinomial probits so I may be missing something, but I think that programmatically at least this is easily handled by Stata's factor variable notation (see help factor variables).

In the example below, I create an interaction between the categorical variable foreign (analogous to your sex) and the continuous variable mpg (analogous to your height5).

webuse fullauto
oprobit rep77 length i.foreign##c.mpg
Iteration 0:   log likelihood = -89.895098  
Iteration 1:   log likelihood = -78.084293  
Iteration 2:   log likelihood = -77.996635  
Iteration 3:   log likelihood = -77.996562  
Iteration 4:   log likelihood = -77.996562  

Ordered probit regression                               Number of obs =     66
                                                        LR chi2(4)    =  23.80
                                                        Prob > chi2   = 0.0001
Log likelihood = -77.996562                             Pseudo R2     = 0.1324

-------------------------------------------------------------------------------
        rep77 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
--------------+----------------------------------------------------------------
       length |   .0458248   .0135235     3.39   0.001     .0193193    .0723303
              |
      foreign |
  1. Foreign  |   1.419149   1.385244     1.02   0.306    -1.295879    4.134177
          mpg |   .1236864   .0490655     2.52   0.012     .0275198     .219853
              |
foreign#c.mpg |
  1. Foreign  |   .0116269   .0536813     0.22   0.829    -.0935866    .1168404
--------------+----------------------------------------------------------------
        /cut1 |   9.820641    3.44782                      3.063039    16.57824
        /cut2 |   10.87018   3.478686                      4.052079    17.68828
        /cut3 |   12.20474   3.522819                      5.300138    19.10933
        /cut4 |   13.64437    3.57088                      6.645572    20.64317
-------------------------------------------------------------------------------
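
To help read the table, here is a sketch of the standard ordered probit setup as I understand it, written for this example. The symbols (the betas, the cutpoints kappa, and x-beta for the whole linear index) are my own notation rather than anything Stata prints.

```latex
\begin{align*}
% Latent index for the example model
y^* &= \beta_1\,\text{length} + \beta_2\,\text{foreign} + \beta_3\,\text{mpg}
       + \beta_4\,(\text{foreign}\times\text{mpg}) + \varepsilon,
       \qquad \varepsilon \sim N(0,1) \\
% Probability of category j, with \kappa_0 = -\infty and \kappa_5 = +\infty
\Pr(\text{rep77}=j \mid x) &= \Phi(\kappa_j - x\beta) - \Phi(\kappa_{j-1} - x\beta) \\
% Marginal effect of mpg on that probability
\frac{\partial \Pr(\text{rep77}=j \mid x)}{\partial\,\text{mpg}}
  &= (\beta_3 + \beta_4\,\text{foreign})\,
     \bigl[\phi(\kappa_{j-1} - x\beta) - \phi(\kappa_j - x\beta)\bigr]
\end{align*}
```

On the latent scale, the slope of mpg is about 0.124 for domestic cars and 0.124 + 0.012 ≈ 0.135 for foreign cars. The effect on the probability of any one category also depends on the bracketed density term, which changes with the other covariates and with the category j, so the interaction coefficient alone does not tell you how much the probability-scale effect differs between the two groups.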

You may then need the margins command to get interpretable marginal effects, because the coefficients above are on the scale of the latent index rather than the probability of each outcome. See help margins or these slides from Richard Williams. As I say, it's been a while since I looked at these models, so I don't know whether this is necessary or correct in your context, but I'd think that one way to get these marginal effects, if you need them, would be to run margins, dydx(mpg) at(foreign=(0 1)) after estimating the ordered probit. That gives the marginal effect of mpg for each outcome category, separately for domestic and foreign cars.

Another way to do this, which gives different results, is margins, dydx(mpg) over(foreign). The results differ because at() sets foreign to 0 and then to 1 for every observation before averaging, whereas over() averages within the actual domestic and foreign subsamples. As to which method is better suited to your context, you can find some discussion here.
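
To make the two margins calls concrete, here is a minimal sketch that re-fits the example model and asks for the effect of mpg on the probability of the top repair-record category only. The choice of predict(outcome(5)) is my own assumption for illustration: it assumes that 5 is the highest value of rep77 in this dataset, which matches the four cutpoints in the output above.

```stata
* Sketch only: re-fit the example model from the answer
webuse fullauto, clear
oprobit rep77 length i.foreign##c.mpg

* Average marginal effect of mpg on Pr(rep77 == 5),
* with foreign set to 0 and then to 1 for every car in the sample
margins, dydx(mpg) at(foreign=(0 1)) predict(outcome(5))

* Average marginal effect of mpg on Pr(rep77 == 5),
* computed within the domestic and foreign subsamples as observed
margins, dydx(mpg) over(foreign) predict(outcome(5))
```

For the model in the question, the analogous specification would presumably be oprobit earncat c.height5##i.sex BMI age i.mrd i.cworker i.race, followed by something like margins, dydx(height5) at(sex=(0 1)). That assumes sex is coded 0/1, so adjust the values to match your data.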