R Sargan Test Results Contrast with Manual Approach

Question

72 Views Asked by Ryan Walter At 28 July 2025 at 09:21

I have been trying to manually reproduce the result given by the Sargan test in R, sadly to no avail.

When I run ivreg() and then output the Sargan test statistic:

library(readstata13) # read.dta13()
library(ivreg)       # ivreg(); the AER package also provides it

eitc <- read.dta13('education_earnings_v2.dta')
eitc$ln.wage <- log(eitc$wage)

TSLS <- ivreg(data = eitc, ln.wage ~ educ + exper + south + nonwhite 
                           | nearc4 + nearc2 + exper + south + nonwhite)

summary(TSLS, diagnostics=TRUE)

I get a Sargan statistic of 1.63. However, when I try to perform the test manually:

surp_IV1 <- lm(educ ~ nearc2 + nearc4 + exper + south + nonwhite, data=eitc)
surp_IV_fit <- surp_IV1$fitted.values
surp_IV2 <- lm(ln.wage ~ surp_IV_fit + exper + south + nonwhite, data=eitc)

surp_resid <- resid(surp_IV2)

test_surplus <- lm(surp_resid ~ nearc2 + nearc4 + exper + south + nonwhite,
                   data = eitc)

summary(test_surplus)

With R-Squared = 0.0008032 on 3,010 observations, I get a test statistic of 2.42.

What is the reason for the difference?


**Zeki Akyol** · Answer 1

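The manual two-stage regression is unnecessary, and it is also where the numbers diverge. The residuals of the second-stage lm() are computed with the fitted values of educ, whereas the Sargan test needs the 2SLS residuals, which are computed with the actual educ. Take the residuals directly from the ivreg() fit and regress them on all the exogenous variables; the statistic is the number of observations times the R-squared of that regression.

The procedure follows Section 15-5b, Testing Overidentification Restrictions, in Wooldridge (2019, 7th edition).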

library(wooldridge) # data(card)
library(dplyr)      # rename()
library(ivreg)      # ivreg()

data(card)

eitc <- card |> 
  rename(nonwhite = black)

TSLS <- ivreg(lwage ~ educ + exper + south + nonwhite 
              | nearc4 + nearc2 + exper + south + nonwhite,
              data = eitc)

summary(TSLS, diagnostics = TRUE)
#> 
#> Call:
#> ivreg(formula = lwage ~ educ + exper + south + nonwhite | nearc4 + 
#>     nearc2 + exper + south + nonwhite, data = eitc)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -2.309361 -0.319674  0.007403  0.334821  1.783133 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  2.17357    0.70494   3.083  0.00207 ** 
#> educ         0.24131    0.04089   5.901 4.02e-09 ***
#> exper        0.10453    0.01660   6.296 3.50e-10 ***
#> south       -0.08534    0.02647  -3.224  0.00128 ** 
#> nonwhite    -0.01541    0.04644  -0.332  0.74009    
#> 
#> Diagnostic tests:
#>                   df1  df2 statistic  p-value    
#> Weak instruments    2 3004    19.682 3.22e-09 ***
#> Wu-Hausman          1 3004    27.580 1.61e-07 ***
#> Sargan              1   NA     1.626    0.202    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.4998 on 3005 degrees of freedom
#> Multiple R-Squared: -0.2668, Adjusted R-squared: -0.2685 
#> Wald test:  87.8 on 4 and 3005 DF,  p-value: < 2.2e-16

TSLS_resid <- resid(TSLS)

surp_IV1 <- lm(TSLS_resid ~ nearc2 + nearc4 + exper + south + nonwhite,
               data = eitc)

nobs(surp_IV1) * summary(surp_IV1)[["r.squared"]] # number of observations * R-squared
#> [1] 1.625811
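
As a quick check on the p-value, the statistic can be compared against a chi-squared distribution whose degrees of freedom equal the number of overidentifying restrictions. This is a minimal sketch; the variable names are only illustrative, and the statistic is copied from the output above.

```r
# Sargan statistic reported above (n * R-squared from the auxiliary regression)
sargan_stat <- 1.625811

# Degrees of freedom = number of instruments - number of endogenous regressors
# (nearc4 and nearc2 instrument educ, so 2 - 1 = 1)
sargan_df <- 2 - 1

# Upper-tail probability of the chi-squared distribution
sargan_p <- pchisq(sargan_stat, df = sargan_df, lower.tail = FALSE)
round(sargan_p, 3)
#> [1] 0.202
```

This agrees with the Sargan row of the ivreg() diagnostics.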

Created on 2023-12-15 with reprex v2.0.2
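
Why the manual two-stage route in the question gives a different number can be seen in a base-R sketch with simulated data. Everything here is an assumption made for illustration (variable names, coefficients, sample size); it is not the card data set. The second-stage lm() yields the 2SLS coefficients, but its residuals are built from the fitted regressor, so they are not the 2SLS residuals.

```r
# Simulated data: names and parameter values are hypothetical
set.seed(1)
n  <- 2000
z1 <- rnorm(n)                       # instrument 1
z2 <- rnorm(n)                       # instrument 2
w  <- rnorm(n)                       # exogenous control
v  <- rnorm(n)                       # unobserved factor that makes x endogenous
x  <- 0.5 * z1 + 0.5 * z2 + 0.3 * w + v + rnorm(n)
y  <- 1 + 0.8 * x + 0.4 * w - v + rnorm(n)

# Stage 1: endogenous regressor on all exogenous variables
stage1 <- lm(x ~ z1 + z2 + w)
x_hat  <- fitted(stage1)

# Stage 2: these coefficients are the 2SLS estimates
stage2 <- lm(y ~ x_hat + w)
b      <- coef(stage2)

# Residuals of the stage-2 lm(): computed with the FITTED x
resid_stage2 <- resid(stage2)

# 2SLS residuals: same coefficients, applied to the ACTUAL x
resid_2sls <- y - (b[["(Intercept)"]] + b[["x_hat"]] * x + b[["w"]] * w)

# n * R-squared from regressing residuals on all exogenous variables
n_r2 <- function(u) {
  aux <- lm(u ~ z1 + z2 + w)
  nobs(aux) * summary(aux)$r.squared
}

sargan_wrong <- n_r2(resid_stage2)   # the manual approach from the question
sargan_right <- n_r2(resid_2sls)     # the Sargan statistic

c(wrong = sargan_wrong, right = sargan_right)
```

Both auxiliary regressions explain the same variation, because the first-stage residuals are orthogonal to the instruments and controls. The two R-squared values differ only because the two residual vectors have different total variation, which is why the manual statistic is off by a scale factor rather than by something unrelated.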
