While working on examining interactions in a set of linear models, my R session crashed and restarted. I work from a .rmd file, so it was easy to restore my work up to that point, and I verified all changes to the .rmd were up to date before the restart.
Before the crash, all models converged without errors or warnings, but after R and RStudio restarted, some, but not all, models failed to converge, with the message:
Model failed to converge with max|grad| = 0.00247628 (tol = 0.002, component 1)
The models come from one of two subsets of my full dataset, but there is no consistency as to which models converged post-crash and which did not on the basis of which subset of the data they were working from. The full data set contains ~20,000 observations, while the two subsets contain ~13,000 each.
Here is an example of two models, the first of which converged both pre- and post-crash, and the second of which converged only pre-crash:
mod1 <- lmer(RT ~ var1 * var2 * var3 * var4 * var5 + (1 + var1 | Subject) + (1 | item), data = df1)
mod2 <- lmer(RT ~ var1 * var2 * var3 * var4 * var5 + (1 + var1 | Subject) + (1 | item), data = df2)
- var1 is a three-level factor, but the data are subset on this factor such that only two levels exist in a given dataset/model.
- var2 is a two-level factor
- var3 is a continuous numerical predictor
- var4 is a four-level factor
- var5 is a two-level factor
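To give a sense of the size of the fixed-effects part implied by the five-way crossing above, here is a minimal sketch in base R. The data frame `toy` and its level labels are hypothetical stand-ins built only to match the factor structure described in the list (var1 with two levels after subsetting); it is not the real data.

```r
# Count the terms produced by expanding the five-way crossing.
f <- RT ~ var1 * var2 * var3 * var4 * var5
n_terms <- length(attr(terms(f), "term.labels"))

# Hypothetical toy design with the same factor structure as described:
# var1, var2, var5 have two levels, var4 has four, var3 is numeric.
toy <- expand.grid(
  var1 = factor(c("A", "B")),
  var2 = factor(c("x", "y")),
  var3 = c(0, 1),
  var4 = factor(1:4),
  var5 = factor(c("p", "q"))
)

# Number of fixed-effect coefficients (columns of the design matrix,
# including the intercept) under default treatment contrasts.
n_coef <- ncol(model.matrix(~ var1 * var2 * var3 * var4 * var5, data = toy))

n_terms
n_coef
```

Under these assumptions the formula expands to 31 terms and 64 fixed-effect coefficients, estimated alongside the random intercepts and the random slope for var1.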
For the models that stopped converging after the crash and restart of R, I can fix the issue by simplifying the random effects structure and removing the slope (i.e. (1|Subject) instead of (1+var1|Subject)).
My questions are about the nature of the issue and the solution. Why did some models stop converging and start returning this specific message after R restarted and I restored all work up to the point before the restart? Why does removing the random slope solve this issue despite the random slope being fine in all models before the crash?
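As a minimal sketch of this comparison, the code below fits a model with and without a random slope and then inspects the convergence messages stored on each fit. It uses the `sleepstudy` data shipped with lme4 as a stand-in (with `Days` playing the role of var1), since the real df1 and df2 are not available here. The refit with the `bobyqa` optimizer is one common way to check whether a warning depends on the optimizer; it is an assumption, not something tried in the post.

```r
library(lme4)

# Stand-in data from lme4; Days plays the role of var1 in the post.
full_mod   <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
simple_mod <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Convergence warnings, if any, are stored on the fitted object
# (this element is NULL when no warning was raised).
full_mod@optinfo$conv$lme4$messages
simple_mod@optinfo$conv$lme4$messages

# Refit the fuller model with a different optimizer to see whether
# a warning, if present, persists.
full_bobyqa <- update(full_mod, control = lmerControl(optimizer = "bobyqa"))
full_bobyqa@optinfo$conv$lme4$messages
```

With the reported values, max|grad| = 0.00247628 is only slightly above the tolerance of 0.002, so comparing the estimates from the two optimizers may help show whether the fits actually differ in any meaningful way.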
Thanks, and please let me know if I can provide any more details about the models, dataset, or anything else that may be relevant.
EDIT: To be more clear on the way in which the data was subset, the full data set was filtered in two distinct ways based on the value of var1. Models were run either on data in which var1 had values A & B, or on data in which var1 had values A & C.
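For illustration, subsetting of this kind might look like the sketch below. The data frame `full` and its contents are hypothetical, and the `droplevels()` call is an assumption rather than a description of what was actually done; it removes the unused third level so that var1 is a genuine two-level factor in each subset.

```r
# Hypothetical stand-in for the full dataset; only var1 matters here.
full <- data.frame(
  var1 = factor(rep(c("A", "B", "C"), times = 4)),
  RT   = seq_len(12)
)

# Keep two of the three levels of var1, then drop the unused level.
df1 <- droplevels(subset(full, var1 %in% c("A", "B")))
df2 <- droplevels(subset(full, var1 %in% c("A", "C")))

# Check the levels that remain in each subset.
levels(df1$var1)
levels(df2$var1)
```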