Missing predictor variable computing Bayesian ANOVA using stan_aov from rstanarm package

94 Views Asked by At

I am computing a Bayesian ANOVA to investigate how my heading variable affects FirstSteeringTime. Here is an example of my data set:

x <- structure(list(FirstSteeringTime = c(0.433389999999999, 0.449999999999989, 
0.383199999999988, 0.499899999999997, 0.566800000000001, 0.58329999999998, 
0.5, 0.449799999999982, 0.566600000000022, 0.466700000000003, 
0.433499999999981, 0.466799999999978, 0.549900000000036, 0.483499999999992, 
0.533399999999972, 0.433400000000006, 0.533200000000022, 0.450799999999999, 
0.45022, 0.46651, 0.68336, 0.483400000000003, 0.5167, 0.383519999999997, 
0.583200000000005, 0.449999999999989, 0.58329999999998, 0.4999, 
0.5334, 0.4666, 0.433399999999978, 0.41670000000002, 0.416600000000017, 
0.45010000000002, 0.666700000000048, 0.433399999999949, 0.466700000000003, 
0.666600000000017, 0.516800000000046, 0.199900000000014, 0.400039999999997, 
0.150100000000009, 0.583399999999983, 0.483400000000017, 0.400099999999952, 
0.666600000000017, 0.434087937888119, 0.516692671379801, 0.533482992494996, 
0.516702632558399), pNum = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 10L, 10L, 10L, 10L), heading = structure(c(4L, 
1L, 4L, 3L, 4L, 3L, 4L, 3L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 
3L, 4L, 4L, 2L, 3L, 4L, 2L, 1L, 2L, 1L, 3L, 1L, 2L, 4L, 3L, 3L, 
4L, 2L, 4L, 4L, 1L, 3L, 4L, 3L, 4L, 3L, 3L, 2L, 3L, 2L, 2L, 2L, 
3L), .Label = c("0.5", "1", "1.5", "2"), class = "factor")), row.names = c(NA, 
50L), class = "data.frame")

First I fit my Bayesian ANOVA model:

op <- options(contrasts = c("contr.helmert", "contr.poly"))

fit_aov <- stan_aov(FirstSteeringTime ~ heading, data = x, prior = R2(0.5))

My model shows me I have 4 predictors. This is correct as I have for levels to my heading variable - 0.5, 1, 1.5 and 2:


fit_aov

stan_aov
 family:       gaussian [identity]
 formula:      FirstSteeringTime ~ heading
 observations: 50
 predictors:   4
------
            Median MAD_SD
(Intercept) 0.5    0.0   
heading1    0.0    0.0   
heading2    0.0    0.0   
heading3    0.0    0.0   

Auxiliary parameter(s):
              Median MAD_SD
R2            0.1    0.1   
log-fit_ratio 0.0    0.1   
sigma         0.1    0.0   

ANOVA-like table:
                Median MAD_SD
Mean Sq heading 0.0    0.0   

Sample avg. posterior predictive distribution of y:
         Median MAD_SD
mean_PPD 0.5    0.0   

However when I compute my region of practical equivalence (ROPE) interval and compare it to my HDIs, I only seem to have three predictors showing?

pd <- p_direction(fit_aov)
percentage_in_rope <- rope(fit_aov, ci = 1)

# Visualise the pd
plot(pd)

pd

 Parameter      pd
   (Intercept) 100.00%
      heading1  80.35%
      heading2  79.47%
      heading3  98.78%
 log-fit_ratio  60.48%
            R2 100.00%

HDIs for Probability of Direction Now my first thought was that one level of heading variable might have a very small effect and thus it wouldn't be big enough for a HDI to be created? But I'm not sure. Does anybody have any ideas? Also can someone explain to me what the log-fit ratio/R2 are and what information they are telling me?

Any help is most appreciated!

1

There are 1 best solutions below

0
On

After some thinking, perhaps the different HDIs correspond to the sequential nature of accumulating data and reconstructing my HDIs? i.e. I start with one one level of my predictor and I add another for each HDI. As I add them, my HDIs become smaller because I can be more confident in the probable values of my effect size parameters? Just some thoughts from me.