Put all possible broom::glance statistics of lm() model combinations with 9 variables into a dataframe in R

As I am just learning R, I am not sure how to solve this. I am trying to get a data frame that shows me the following:

Model Number | adj.r.squared | sigma  | statistic | df
-------------------------------------------------------
Model 1      | 0.465         | 0.437  | 459.0     | 8
Model 2      | 0.0465        | 0.0437 | 659.0     | 7

I am using glance() from the broom package to get these statistics, and I wrote a small helper function that keeps only the columns I need:

glancing <- function(x) {
  glance(x)[c("adj.r.squared", "sigma", "statistic", "df")]
}

I am using a dataset that has 9 predictor variables ("danceability", "energy", "loudness", "speechiness", "acousticness", "liveness", "valence", "tempo", "instrumentalness"), and I need a linear regression for every possible combination of them to predict the popularity score.

I found a way to put all the formulas in a list:

# Response first, then the nine candidate predictors
characteristics <- c("popularity", "danceability", "energy", "loudness", "speechiness", "acousticness", "liveness", "valence", "tempo", "instrumentalness")
# All combinations of 1 to 9 predictors; each list element is a matrix with one combination per column
COMB <- lapply(1:9, function(m) combn(x = characteristics[2:10], m))
formulas <- list()
k <- 0
for (i in seq_along(COMB)) {
  tmp <- COMB[[i]]
  for (j in seq_len(ncol(tmp))) {
    k <- k + 1
    # Build a formula such as popularity ~ danceability + energy
    formulas[[k]] <- formula(paste("popularity", "~", paste(tmp[, j], collapse = " + ")))
  }
}
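
For comparison, here is a minimal sketch of building the same list without the counter variable. It relies on combn() accepting a FUN argument and on reformulate() from base R; the name predictors is introduced here only for illustration and is not part of the code above.

```r
# The nine candidate predictors (same names as in the question)
predictors <- c("danceability", "energy", "loudness", "speechiness",
                "acousticness", "liveness", "valence", "tempo",
                "instrumentalness")

# For each subset size m, combn() calls FUN on every combination of m
# predictors; simplify = FALSE keeps the resulting formulas in a list.
formulas <- do.call(c, lapply(seq_along(predictors), function(m) {
  combn(predictors, m,
        FUN = function(vars) reformulate(vars, response = "popularity"),
        simplify = FALSE)
}))

length(formulas)  # one formula per non-empty subset: 2^9 - 1 = 511
```

Because the formulas stay in an ordinary list, they can be passed straight to lapply() when fitting the models.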

I was also able to fit a linear model for each formula in the list and assign each model to its own object:

# Assign each model to its own variable (model1, model2, ...)
for(i in 1:length(formulas)) {                    
  assign(paste0("model",i),lm(formulas[[i]], data=training_data))
}

This leaves me with 511 models (objects), which I have to put into the glancing function manually, and then combine into a data frame.

Is there an easier way of doing this altogether?

I already tried to convert the list into a data frame or a vector, but that seems to fail because the elements have class "formula".

Your help is appreciated!

Best answer

Replace this loop using assign:

for(i in 1:length(formulas)) {                    
  assign(paste0("model",i),lm(formulas[[i]], data=training_data))
}

With this loop using a list:

model_list = list()
for(i in 1:length(formulas)) {                    
  model_list[[i]] = lm(formulas[[i]], data=training_data)
}

Then if you want to glance all of them:

library(dplyr)
library(broom)
glance_results = bind_rows(lapply(model_list, glance))
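
To get closer to the table in the question (a model label plus the four selected statistics), the same idea can be combined with the glancing() helper and the .id argument of bind_rows(). The following is only a sketch: it uses the built-in mtcars data with three predictors as a stand-in for training_data, and the names predictors and the "model" column are assumptions made for this example.

```r
library(broom)
library(dplyr)

# Helper from the question: keep only the statistics of interest
glancing <- function(x) {
  glance(x)[c("adj.r.squared", "sigma", "statistic", "df")]
}

# Stand-in data (assumption): mtcars with mpg as response instead of training_data
predictors <- c("wt", "hp", "qsec")
formulas <- do.call(c, lapply(seq_along(predictors), function(m) {
  combn(predictors, m,
        FUN = function(vars) reformulate(vars, response = "mpg"),
        simplify = FALSE)
}))

# Fit every model and keep the fits in a named list
model_list <- lapply(formulas, function(f) lm(f, data = mtcars))
names(model_list) <- paste("Model", seq_along(model_list))

# One row per model; the list names become the "model" column
glance_results <- bind_rows(lapply(model_list, glancing), .id = "model")

# Record which formula each row belongs to
glance_results$formula <- vapply(
  formulas, function(f) paste(deparse(f), collapse = " "), character(1)
)

glance_results
```

With the real data, replacing mtcars with training_data and using the 511 formulas from the question should yield one row per model, which can then be sorted, for example with arrange(desc(adj.r.squared)).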