I wanted to reproduce a stata code with R using feols from the fixest package and predictions from the marginaleffects package. However, while I manage to produce the same coefficients and standard errors using feols compared to Stata's xtreg, predictions calculates larger standard errors than margins in stata.
I calculate the following in stata:
. xtreg happy i.marry c.yrsmarried age loghhinc, fe vce(cluster pid)
------------------------------------------------------------------------------
| Robust
happy | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.marry | .1450998 .0170226 8.52 0.000 .1117346 .178465
yrsmarried | -.0178269 .0027528 -6.48 0.000 -.0232225 -.0124314
age | -.0190082 .0014402 -13.20 0.000 -.021831 -.0161854
loghhinc | .2448844 .011323 21.63 0.000 .2226908 .2670781
_cons | 5.921477 .0907651 65.24 0.000 5.743572 6.099381
-------------+----------------------------------------------------------------
sigma_u | 1.2766298
sigma_e | 1.2949898
rho | .49286093 (fraction of variance due to u_i)
------------------------------------------------------------------------------
.
. margins, at(marry=(0 1) yrsmarried=(0 5 10))
Predictive margins Number of obs = 210,287
Model VCE: Robust
Expression: Linear prediction, predict()
1._at: marry = 0
yrsmarried = 0
2._at: marry = 0
yrsmarried = 5
3._at: marry = 0
yrsmarried = 10
4._at: marry = 1
yrsmarried = 0
5._at: marry = 1
yrsmarried = 5
6._at: marry = 1
yrsmarried = 10
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 7.213823 .0053026 1360.43 0.000 7.20343 7.224216
2 | 7.124688 .0109761 649.11 0.000 7.103176 7.146201
3 | 7.035554 .0243252 289.23 0.000 6.987877 7.08323
4 | 7.358923 .0141712 519.29 0.000 7.331148 7.386698
5 | 7.269788 .0151294 480.51 0.000 7.240135 7.299441
6 | 7.180654 .0252162 284.76 0.000 7.131231 7.230076
------------------------------------------------------------------------------
And here the R results
> FE_CI1 <-feols(happy ~ marry + yrsmarried + loghhinc + age | pid, data = base_ego_c_r_a)
> FE_CI1
OLS estimation, Dep. Var.: happy
Observations: 210,287
Fixed-effects: pid: 26,574
Standard-errors: Clustered (pid)
Estimate Std. Error t value Pr(>|t|)
marry 0.145100 0.017023 8.52394 < 2.2e-16 ***
yrsmarried -0.017827 0.002753 -6.47605 9.5801e-11 ***
loghhinc 0.244884 0.011323 21.62719 < 2.2e-16 ***
age -0.019008 0.001440 -13.19869 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 1.21039 Adj. R2: 0.422222
Within R2: 0.009788
>
> predictions(FE_CI1,
+ by = c("marry","yrsmarried"),
+ newdata=datagridcf(yrsmarried = c(0,5,10), marry = c(0,1))
+ )
yrsmarried marry Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
0 0 7.21 0.0929 77.7 <0.001 Inf 7.03 7.40
5 0 7.12 0.0884 80.6 <0.001 Inf 6.95 7.30
10 0 7.04 0.0858 82.0 <0.001 Inf 6.87 7.20
0 1 7.36 0.0917 80.2 <0.001 Inf 7.18 7.54
5 1 7.27 0.0868 83.7 <0.001 Inf 7.10 7.44
10 1 7.18 0.0839 85.6 <0.001 Inf 7.02 7.35
The results in R and Stata are the same if I calculate a simple OLS model with lm in R. Hence, I expect that is the combination of feols and predictions that creates these differences.
----------------------------(added example to reproduce the results)
#load example panel data: here the example dataset from panelr is used
#install.packages("panelr")
data("WageData", package="panelr")
#save data for stata use
#write_dta(data=WageData, path="WageData.dta")
library("tidyverse")
library("fixest")
library("haven")
library("marginaleffects")
FE_CI1<-feols(lwage ~ union + ms + occ + ed + exp | id, data = WageData, panel.id = ~id)
predictions(FE_CI1,
by = c("ms","exp"),
newdata=datagridcf(exp = c(0,10,20), ms = c(0,1)),
vcov = ~id
)
exp ms Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
0 0 4.78 0.0143 335 <0.001 Inf 4.75 4.81
10 0 5.75 0.0213 269 <0.001 Inf 5.71 5.79
20 0 6.72 0.0364 184 <0.001 Inf 6.65 6.79
0 1 4.75 0.0320 149 <0.001 Inf 4.69 4.81
10 1 5.72 0.0352 162 <0.001 Inf 5.65 5.79
20 1 6.68 0.0456 147 <0.001 Inf 6.60 6.77
use "WageData.dta"
xtreg lwage i.union i.ms i.occ c.ed c.exp , fe vce(cluster id)
margins, at(ms=(0 1) exp=(0 10 20))
Predictive margins Number of obs = 4,165
Model VCE: Robust
Expression: Linear prediction, predict()
1._at: ms = 0
exp = 0
2._at: ms = 0
exp = 10
3._at: ms = 0
exp = 20
4._at: ms = 1
exp = 0
5._at: ms = 1
exp = 10
6._at: ms = 1
exp = 20
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 4.782331 .0405666 117.89 0.000 4.702821 4.86184
2 | 5.749547 .0273942 209.88 0.000 5.695855 5.803239
3 | 6.716763 .021801 308.09 0.000 6.674034 6.759492
4 | 4.750068 .0354733 133.91 0.000 4.680542 4.819595
5 | 5.717285 .0182065 314.03 0.000 5.681601 5.752969
6 | 6.684501 .0049639 1346.62 0.000 6.674772 6.69423
------------------------------------------------------------------------------