"saturated likelihood may be inaccurate" warning and negative deviance when running betar family in GAM


My code for fitting the generalized additive model with the betar family is as follows.

library(mgcv)
b1 <- gam(ssim_exp ~ s(stage, k = 4, fx = TRUE, by = comparison_type) + comparison_type, data = df, family = betar(link = "logit", eps=.Machine$double.eps*1000))

Output

saturated likelihood may be inaccurate
Family: Beta regression(0.434) 
Link function: logit 

Formula:
ssim_exp_scale ~ s(stage, k = 4, fx = TRUE, by = comparison_type) + 
    comparison_type

Parametric coefficients:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)               -0.5572     0.1607  -3.468 0.000524 ***
comparison_typefunctions   2.0598     0.1988  10.362  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                                  edf Ref.df Chi.sq  p-value    
s(stage):comparison_typecomplete    3      3  19.07 0.000265 ***
s(stage):comparison_typefunctions   3      3   0.88 0.830160    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  -0.00757   Deviance explained = -16.4%
-REML = -1035.1  Scale est. = 1         n = 171
saturated likelihood may be inaccurate
saturated likelihood may be inaccurate

I tried decreasing eps, but I still get the warning "saturated likelihood may be inaccurate" and a negative deviance explained (-16.4%). Any idea why this happens, and how can I fix it?

For context: my data contain some exact 0s and 1s. The dependent variable is a percentage (0-100%) rescaled to the interval [0, 1]. It is a similarity measure, like Jaccard similarity: https://www.learndatasci.com/glossary/jaccard-similarity/
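
To make the boundary-value situation concrete, here is a minimal sketch in base R. The vector y is a hypothetical stand-in for the real response (it is not my data). The clamping step assumes that betar() truncates the response to [eps, 1 - eps], which is how I read the description of the eps argument in the mgcv documentation. It is an illustration of what the exact 0s and 1s turn into, not a reproduction of what gam() does internally.

```r
# Hypothetical stand-in for df$ssim_exp: proportions in [0, 1]
# that include exact 0s and 1s (NOT the real data)
y <- c(0, 0, 0.12, 0.5, 0.87, 1, 1)

# Count the boundary values, which lie outside the open interval (0, 1)
n_zero <- sum(y == 0)
n_one  <- sum(y == 1)

# Same eps as in the gam() call above
eps <- .Machine$double.eps * 1000

# Assumption: the response is truncated to [eps, 1 - eps]
y_clamped <- pmin(pmax(y, eps), 1 - eps)

# On the logit scale the clamped 0s and 1s sit far away from the
# interior values, so they can dominate the fit
logit_clamped <- qlogis(y_clamped)
print(c(zeros = n_zero, ones = n_one))
print(range(logit_clamped))
```

With an eps this small, the clamped values end up at extreme positions on the logit scale compared with the interior observations, which may be related to the warning and to the negative deviance explained.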

This is the distribution of the dependent variable of my data

[Image: distribution of the dependent variable]
