I am using ANOVA to analyse results from an experiment to see whether there are any effects of my explanatory variables (Heating and Dungfauna) on my response variable (Biomass). I started by looking at the main effects and interaction:
full.model <- lm(log(Biomass) ~ Heating*Dungfauna, data= df)
anova(full.model)
I understand that it is necessary to complete model simplification, removing non-significant interactions or effects to eventually reach the simplest model which still explains the results. I tried two ways of removing the interaction. However, when I manually remove the interaction (Heating*Fauna
-> Heating+Fauna
), the new ANOVA gives a different output to when I use this model simplification 'shortcut':
new.model <- update(full.model, .~. -Dungfauna:Heating)
anova(model)
Which way is the appropriate way to remove the interaction and simplify the model?
In both cases the data is log transformed -
lm(log(CC_noAcari_EmergencePatSoil)~ Dungfauna*Heating, data= biomass)
ANOVA output from manually changing Heating*Dungfauna
to Heating+Dungfauna
:
Response: log(CC_noAcari_EmergencePatSoil)
Df Sum Sq Mean Sq F value Pr(>F)
Heating 2 4.806 2.403 5.1799 0.01012 *
Dungfauna 1 37.734 37.734 81.3432 4.378e-11 ***
Residuals 39 18.091 0.464
ANOVA output from using simplification 'shortcut':
Response: log(CC_noAcari_EmergencePatSoil)
Df Sum Sq Mean Sq F value Pr(>F)
Dungfauna 1 41.790 41.790 90.0872 1.098e-11 ***
Heating 2 0.750 0.375 0.8079 0.4531
Residuals 39 18.091 0.464
R's
anova
andaov
functions compute the Type I or "sequential" sums of squares. The order in which the predictors are specified matters. A model that specifiesy ~ A + B
is asking for the effect of A conditioned on B, whereasY ~ B + A
is asking for the effect of B conditioned on A. Notice that your first model specifiesDungfauna*Heating
, while your comparison model usesHeating+Dungfauna
.Consider this simple example using the "mtcars" data set. Here I specify two additive models (no interactions). Both models specify the same predictors, but in different orders:
You could specify Type II or Type III sums of squares using
car::Anova
:Which gives the same result for both models:
summary
also provides equivalent (and consistent) metrics regardless of the order of predictors, though it's not quite a formal ANOVA table: