I am trying to use glm in R on a data frame with about 1,000 columns. I want to pick one independent variable and loop over the remaining columns, fitting a model with each of them in turn as the dependent variable.
As a test, glm works perfectly fine when I specify single columns with df$col1 for both the dependent and the independent variable.
However, I can't seem to correctly subset a range of columns (below), and I keep getting this error no matter how I format the df:
'data' must be a data.frame, environment, or list
What I tried:
# df = my df
cols <- df[, 20:1112]
for (i in cols) {
  glm <- glm(df$col1 ~ ., data=df, family=gaussian)
}
It would be more idiomatic to do:
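A minimal sketch of one such approach, assuming the predictor is named col1 and the responses are selected by position as in the question. The toy data frame and the names response_cols, fits and slopes are made up for illustration. reformulate() builds each formula from column names, so glm gets a proper data frame in data= and the loop variable actually appears in the model:

```r
# Toy stand-in for the real data: one predictor and three response columns.
set.seed(1)
df <- data.frame(col1 = rnorm(50))
df$y1 <- 2 * df$col1 + rnorm(50)
df$y2 <- -1 * df$col1 + rnorm(50)
df$y3 <- rnorm(50)

# Loop over column *names*; in the question this would be names(df)[20:1112].
response_cols <- names(df)[2:4]

# Pre-allocate a named list holding one fitted model per response column.
fits <- setNames(vector("list", length(response_cols)), response_cols)

for (nm in response_cols) {
  # reformulate("col1", response = nm) builds the formula  <nm> ~ col1
  form <- reformulate("col1", response = nm)
  fits[[nm]] <- glm(form, data = df, family = gaussian)
}

# Extract the col1 slope from every fit.
slopes <- sapply(fits, function(m) coef(m)[["col1"]])
```

Storing the fits in a named list keeps every model available afterwards, whereas assigning to a single variable inside the loop would keep only the last one. Column names that are not syntactically valid R names may need backticks before being passed to reformulate().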
In fact, if you really just want a Gaussian GLM (with the default identity link), it will be slightly faster to use lm() in the loop instead.
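As a rough illustration of that swap, assuming the same hypothetical col1 predictor and toy response columns, the loop body only needs a different fitting function. The last lines check that the coefficients agree with the Gaussian GLM:

```r
# Toy data: one predictor, two responses (stand-ins for the real columns).
set.seed(1)
df <- data.frame(col1 = rnorm(50))
df$y1 <- 2 * df$col1 + rnorm(50)
df$y2 <- -1 * df$col1 + rnorm(50)

response_cols <- c("y1", "y2")

# lapply over a named vector so the result is a list named by response.
lm_fits <- lapply(setNames(response_cols, response_cols), function(nm) {
  lm(reformulate("col1", response = nm), data = df)
})

# A Gaussian GLM with identity link is ordinary least squares,
# so the coefficients should match up to numerical tolerance.
glm_fit <- glm(y1 ~ col1, data = df, family = gaussian)
same_coefs <- isTRUE(all.equal(coef(lm_fits$y1), coef(glm_fit)))
```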
If you want to get fancy:
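One possible "fancy" version, sketched here as an assumption about what is meant: lm() accepts a matrix on the left-hand side and fits every column against the same predictor in a single call. The names Y, mfit and coefs are illustrative only:

```r
# Toy data: one predictor and three response columns.
set.seed(1)
df <- data.frame(col1 = rnorm(50))
df$y1 <- 2 * df$col1 + rnorm(50)
df$y2 <- -1 * df$col1 + rnorm(50)
df$y3 <- rnorm(50)

# Response matrix; in the question this would be as.matrix(df[, 20:1112]).
Y <- as.matrix(df[, 2:4])

# One call fits all columns of Y on col1 (the result has class "mlm").
mfit <- lm(Y ~ col1, data = df)

# Coefficient matrix: one row per term, one column per response.
coefs <- coef(mfit)

# Cross-check one column against a separate single-response fit.
single <- lm(y1 ~ col1, data = df)
matches_single <- isTRUE(all.equal(unname(coefs[, "y1"]), unname(coef(single))))
```

This only applies to the lm() case, not to glm() in general. Note also that a row with a missing value in any response column is dropped from all of the fits, which can differ from fitting each column separately.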