fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients AND boundary (singular)

Question

fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients AND boundary (singular)

1.1k Views Asked by cpc At 27 July 2025 at 17:18

Study background: I want to see if mean caterpillar abundance for a given year and within a given population, can explain differences in bird(blue tit) density. The blue tits breed in nest boxes, so I calculated density as the total number of occupied nest boxes/the number of unoccupied nest boxes within a given year and population.

Below I show the structure of the data, my model and the error messages.

Model:

model1.1 <-glmer(cbind(data.density$number.nest.boxes.occupied.that.year,
          data.density$number_of_nest.boxes)~population*year*caterpillar+(1|site),
   data = data.density, family=binomial)

The first error is:

fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients

I think this this is due to not having enough combinations of caterpillars with population x year.

The second error is

boundary (singular) fit: see ?isSingular

I'm just not sure how to go about fixing this. I also don't understand what the other error means and how to fix it.

I appreciate any advice.

#loading data density data

data.density<-read.csv ("nest_box_caterpillar_density.csv") 
View(data.density)


str(data.density)
#> 'data.frame':    63 obs. of  8 variables:
#>  $ year                                : int  2011 2012 2013 2014 2015 2016 2017 2018 2019 2011 ...
#>  $ number.nest.boxes.occupied.that.year: int  17 13 12 16 16 16 15 17 12 17 ...
#>  $ number_of_nest.boxes                : int  20 20 20 20 20 20 20 20 20 30 ...
#>  $ proportion_occupied_boxes           : num  0.85 0.65 0.6 0.8 0.8 0.8 0.75 0.85 0.6 0.57 ...
#>  $ site                                : Factor w/ 7 levels "ari","ava","fel",..: 5 5 5 5 5 5 5 5 5 1 ...
#>  $ population                          : Factor w/ 3 levels "D-Muro","E-Muro",..: 2 2 2 2 2 2 2 2 2 2 ...
#>  $ mean_yearly_frass                   : num  295 231 437 263 426 ...
#>  $ site_ID                             : Factor w/ 63 levels "2011_ari_","2011_ava_",..: 5 12 19 26 33 40 47 54 61 1 ...

data.density$year<-factor (data.density$year)# making year a factor (categorical variable)

str(data.density) # now we see year as a factor in the data. 
#> 'data.frame':    63 obs. of  8 variables:
#>  $ year                                : Factor w/ 9 levels "2011","2012",..: 1 2 3 4 5 6 7 8 9 1 ...
#>  $ number.nest.boxes.occupied.that.year: int  17 13 12 16 16 16 15 17 12 17 ...
#>  $ number_of_nest.boxes                : int  20 20 20 20 20 20 20 20 20 30 ...
#>  $ proportion_occupied_boxes           : num  0.85 0.65 0.6 0.8 0.8 0.8 0.75 0.85 0.6 0.57 ...
#>  $ site                                : Factor w/ 7 levels "ari","ava","fel",..: 5 5 5 5 5 5 5 5 5 1 ...
#>  $ population                          : Factor w/ 3 levels "D-Muro","E-Muro",..: 2 2 2 2 2 2 2 2 2 2 ...
#>  $ mean_yearly_frass                   : num  295 231 437 263 426 ...
#>  $ site_ID                             : Factor w/ 63 levels "2011_ari_","2011_ava_",..: 5 12 19 26 33 40 47 54 61 1 ...



density<-data.density$proportion_occupied_boxes # making a new object called density
caterpillar<-data.density$mean_yearly_frass # making new object called caterpillar

model1.1<-glmer(cbind(data.density$number.nest.boxes.occupied.that.year,data.density$number_of_nest.boxes)~population*year*caterpillar+(1|site),data = data.density, family=binomial)  
#> fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients
#> boundary (singular) fit: see ?isSingular

summary(model1.1)
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: 
#> cbind(data.density$number.nest.boxes.occupied.that.year, data.density$number_of_nest.boxes) ~  
#>     population * year * caterpillar + (1 | site)
#>    Data: data.density
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    343.7    403.7   -143.8    287.7       35 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -1.1125 -0.1379  0.0000  0.2264  0.6778 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  site   (Intercept) 0        0       
#> Number of obs: 63, groups:  site, 7
#> 
#> Fixed effects:
#>                              Estimate Std. Error z value Pr(>|z|)  
#> (Intercept)                -0.4054532  0.2454754  -1.652   0.0986 .
#> populationE-Muro           -0.1158123  0.2301030  -0.503   0.6147  
#> populationE-Pirio          -0.4945158  0.2932707  -1.686   0.0918 .
#> year2012                    0.0905137  0.2109513   0.429   0.6679  
#> year2013                   -0.1223076  0.2160367  -0.566   0.5713  
#> year2014                   -0.0703760  0.2304236  -0.305   0.7600  
#> year2015                   -0.0507882  0.2127083  -0.239   0.8113  
#> year2016                   -0.0562139  0.2077616  -0.271   0.7867  
#> year2017                   -0.0994962  0.2070464  -0.481   0.6308  
#> year2018                    0.0977751  0.2192755   0.446   0.6557  
#> year2019                   -0.2312869  0.2133430  -1.084   0.2783  
#> caterpillar                 0.0004598  0.0005432   0.846   0.3973  
#> populationE-Muro:year2012  -0.1217344  0.3294773  -0.369   0.7118  
#> populationE-Pirio:year2012 -0.3121173  0.2912256  -1.072   0.2838  
#> populationE-Muro:year2013  -0.0682892  0.3600992  -0.190   0.8496  
#> populationE-Pirio:year2013 -0.3345701  0.3051039  -1.097   0.2728  
#> populationE-Muro:year2014   0.1604636  0.3383121   0.474   0.6353  
#> populationE-Pirio:year2014 -0.1074231  0.3171972  -0.339   0.7349  
#> populationE-Muro:year2015   0.0838557  0.3491699   0.240   0.8102  
#> populationE-Pirio:year2015 -0.0640988  0.2943189  -0.218   0.8276  
#> populationE-Muro:year2016   0.0679017  0.3333771   0.204   0.8386  
#> populationE-Pirio:year2016 -0.0899343  0.2919975  -0.308   0.7581  
#> populationE-Muro:year2017   0.1643493  0.3300491   0.498   0.6185  
#> populationE-Pirio:year2017  0.0338824  0.2730344   0.124   0.9012  
#> populationE-Muro:year2018   0.0315607  0.3264224   0.097   0.9230  
#> populationE-Pirio:year2018 -0.4196974  0.3180515  -1.320   0.1870  
#> populationE-Muro:year2019  -0.0587593  0.3619408  -0.162   0.8710  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Correlation matrix not shown by default, as p = 27 > 12.
#> Use print(x, correlation=TRUE)  or
#>     vcov(x)        if you need it
#> fit warnings:
#> fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients
#> optimizer (Nelder_Mead) convergence code: 0 (OK)
#> boundary (singular) fit: see ?isSingular

^{Created on 2022-03-21 by the reprex package (v2.0.1)}

I tried removing caterpillars from the model, and the error first error goes away. But the point of my model is to see how caterpillar effects density. I also still get the "boundary (singular) fit: see ?isSingular" error.

Original Q&A

There are 1 best solutions below

**Ben Bolker** · Answer 1

There are a few problems with your model specification.

binomial responses should be specified as cbind(n_success, n_failure), not cbind(n_success, n_total) (in your case cbind(boxes_occupied, total_boxes-boxes_occupied). (I actually find it clearer to use the alternative specification boxes_occupied/total_boxes with the additional argument weights=total_boxes ...)
there are few absolutely hard and fast rules, but it probably makes sense to estimate the effects of at least year as a random effect, and possibly population within site (this will depend on how much you are interested in the detailed differences between populations)
with a total of 63 observations, you need to be parsimonious with your model; a reasonable rule of thumb is that you shouldn't try to estimate more than at most 6 parameters
so I would recommend something like

 glmer(boxes_occupied/total_boxes ~ caterpillar + population + (1|year) + (1|site) + caterpillar, 
    data = data.density, 
    weights = total_boxes,
    family = binomial)

As for singularity, this is a fundamental issue in mixed models, and has been discussed in lots of places. There's a lot of information provided at ?lme4::isSingular (i.e., look up the help page for the isSingular() function), and at the GLMM FAQ, or see some of the existing answers on Stack Overflow about singular fits ...

fixed-effect model matrix is rank deficient so dropping 27 columns / coefficients AND boundary (singular)

There are 1 best solutions below

Related Questions in R

Related Questions in MIXED-MODELS

Related Questions in SINGULAR

Trending Questions

Popular # Hahtags

Popular Questions