How can I use a for loop to run regression?

My current dataset looks like:

N = 10000
wage <- rnorm(N)
educ <- rnorm(N)
age  <- rnorm(N)
tce  <- rnorm(N)

work <- rbinom(n = N, size = 1, prob = 0.05)
manu <- rbinom(n = N, size = 1, prob = 0.05)

id <- sample(10, N, replace = TRUE)

df <- data.frame(wage, educ, age, tce, work, manu, id)

wage, work and manu are my dependent variables, and the rest of my variables are my independent variables.

Currently, I am repeating the syntax, but just changing the outcome variable as such:

library(fixest)

model1 <- feols(work ~ educ + age + tce | id, data = df)

model2 <- feols(manu ~ educ + age + tce | id, data = df)

model3 <- feols(wage ~ educ + age + tce | id, data = df)

Is there a way I can use a for loop to run such regressions?

Moreover, after running the regressions, I would also like to plot the coefficients of the regressions as such:

library(modelsummary)

modelplot(
 list(model1, model2, model3)
 )

However, since a for loop doesn't create a new object on each iteration, how can I plot the coefficients?

There are 2 best solutions below

Best answer

Multiple estimation is built into fixest. Wrap the dependent variables in c(), as in c(v1, v2), to run the same regression for each of them. This is also substantially faster than looping.

est_multi = feols(c(work, manu, wage) ~ educ + age + tce | id, df)
etable(est_multi)
#>                          model 1          model 2           model 3
#> Dependent Var.:             work             manu              wage
#>                                                                    
#> educ            -0.0060 (0.0031) -0.0009 (0.0023)   0.0204 (0.0139)
#> age              0.0018 (0.0028)  0.0003 (0.0030)   0.0092 (0.0053)
#> tce             -0.0013 (0.0027)  0.0036 (0.0020) -0.0174. (0.0075)
#> Fixed-Effects:  ---------------- ---------------- -----------------
#> id                           Yes              Yes               Yes
#> _______________ ________________ ________________ _________________
#> S.E.: Clustered           by: id           by: id            by: id
#> Observations              10,000           10,000            10,000
#> R2                       0.00117          0.00038           0.00175
#> Within R2                0.00083          0.00031           0.00082
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that if you have the variable names in a character vector, you can insert them directly into the formula with the dot-square-bracket operator:

depvars = c("work", "manu", "wage")
est_multi_bis = feols(.[depvars] ~ educ + age + tce | id, df)

You can find some documentation on multiple estimations in the dedicated vignette.
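
To plot the coefficients from the multiple estimation, one option is to convert the result to a plain list and hand it to modelplot(). The following is a minimal sketch, assuming fixest's as.list() method for multi-estimation objects and data simulated as in the question; the seed and the names models and p are introduced here for illustration only.

```r
library(fixest)
library(modelsummary)

# Simulated data with the same structure as in the question
# (the seed is an addition, for reproducibility only)
set.seed(1)
N <- 10000
df <- data.frame(
  wage = rnorm(N), educ = rnorm(N), age = rnorm(N), tce = rnorm(N),
  work = rbinom(N, size = 1, prob = 0.05),
  manu = rbinom(N, size = 1, prob = 0.05),
  id   = sample(10, N, replace = TRUE)
)

depvars <- c("work", "manu", "wage")

# One call estimates all three models
est_multi <- feols(c(work, manu, wage) ~ educ + age + tce | id, df)

# Convert to an ordinary list and label each model by its dependent variable
# (the order follows the order of the variables on the left-hand side)
models <- as.list(est_multi)
names(models) <- depvars

# modelplot() takes a list of models and returns a ggplot object
p <- modelplot(models)
print(p)
```

Naming the list elements is optional, but the names are what modelplot() shows in the legend, so it makes the plot easier to read.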

Another answer

I can't replicate your example with the provided code. Nevertheless, you can use a loop like this:

variable <- c("work", "manu", "wage")
datalist <- list()

for(i in variable) {
  formula <- as.formula(paste(i, " ~ educ + age + tce | id"))
  model <- feols(formula, data = df)
  datalist[[i]] <- model
}

Each fitted model is stored in the list under the name of its dependent variable, so you can retrieve it with, for example, datalist[["wage"]].
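
As for the plotting part of the question, a named list built this way can be passed straight to modelplot(), so no separate model1, model2, model3 objects are needed. Below is a minimal sketch of the whole workflow, assuming data simulated as in the question; the seed and the names outcomes, models, fml and p are illustrative choices, not part of the original code.

```r
library(fixest)
library(modelsummary)

# Simulated data with the same structure as in the question
# (the seed is an addition, for reproducibility only)
set.seed(1)
N <- 10000
df <- data.frame(
  wage = rnorm(N), educ = rnorm(N), age = rnorm(N), tce = rnorm(N),
  work = rbinom(N, size = 1, prob = 0.05),
  manu = rbinom(N, size = 1, prob = 0.05),
  id   = sample(10, N, replace = TRUE)
)

outcomes <- c("work", "manu", "wage")
models <- list()

for (y in outcomes) {
  # Build the formula as text, then convert it; "| id" adds the fixed effect
  fml <- as.formula(paste(y, "~ educ + age + tce | id"))
  # Store each fit under the name of its dependent variable
  models[[y]] <- feols(fml, data = df)
}

# The list names become the model labels in the plot legend
p <- modelplot(models)
print(p)
```

The same list also works with modelsummary(models) if a regression table is wanted alongside the plot.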