How do I work around ragged data in my bayesian model in WINBUGS? - Coding Issue

31 Views Asked by At

Firstly, I am a student very new to what I am doing. I'd like to say that the model I am building is very simplistic at this stage. Please don't judge my work for what you see here as it's totally incomplete and terrible right now and I know it... I am just asking for coding help.

I built this model in WINBUGS to get me going. It iterates over a 1000 observations and estimates beta0 and beta1. From what I understand, in this model:

  • beta0 is baseline risk of disease when there is no exposure
  • beta1 is the change in the odds of disease occurrence for a one-unit increase in the exposure variable
model {

  # Likelihood 
  for (i in 1:999) {
    disease[i] ~ dbern(p[i])                        # prob of disease given exposure
    p[i] <- 1 / (1 + exp(- (beta0 + (beta1 * exposure[i]))))  # inverse logit function 
  }
  # Prior for beta 
  beta0 ~ dnorm(0, 1.0E-4)
  beta1 ~ dnorm(0, 1.0E-4)

}

Init Values 
list(beta0=0, beta1=0)

Data 
exposure[]  disease[]
0.3 0
0.4 0
0.6 0
...

This model works in WINBUGS for me. Now, I need to extend this model further because I have actually ragged data which I have not considered in the model above.

Details about my ragged data:

  • 1000 observations for 50 individuals
  • observations per individual can range anywhere from 1 to 32. One person could have 1 obs, another 3 obs and so on...

** What does my actual data look like? **

ID Exposure Disease
1 0.1 1
1 0.2 1
1 0.4 1
2 0.4 0
2 0.9 0
3 0.1 1
3 0.3 1
3 0.6 1
4 0.2 1

**Places I've already been for help with little success: ** https://costa.eeb.ucsc.edu/wp-content/uploads/2015/10/hierarchical-slides.pdf https://www.mrc-bsu.cam.ac.uk/wp-content/uploads/manual14.pdf and more, but these were most relevant to my goal.

What am I trying to achieve at this stage?

Simply, a model in which I want to

  1. Sum up my exposures for each individual
  2. Use the summed-up exposures as "new" exposure values in the model I've shown above
  3. Get betas and probability of disease as I've done with the model I've shown above...

Now, I know I could just sum up my exposure values for each individual using R and pre-process the data, but I am purposely trying to avoid this. The idea is to code a model that incorporates this even though it is unnecessarily complex.

Anyway, so what have I actually tried?

Tried to add this:

    for (i in 1:50) {
        sumexposure[i] <- sum(exposure[id == i]) 
        }

It didn't work. Error: "Expected a left square bracket" (sorry I forgot what it said exactly)...

I thought I might declare the new variable sumexposure

    for (i in 1:50) {
            sumexposure[i] <- 0 
            }

    for (i in 1:50) {
        sumexposure[i] <- sum(sumexposure[id == i]) 
        }

It didnt work. Error: "Expected a left square bracket" (Again, sorry I forgot what it said exactly)

The list goes on of variations of the codes above. I've tried this too:

for(i in 1:50) {
for(j in 1:32) {
   sumexposure[i] <- sum(exposure[i,j])
}
}

I think this is just wrong completely to be honest.

for (i in 1:50) {
  sumexposure[i] <- 0
}

# Loop for individuals and exposure observations
for (i in 1:50) {
  for (j in 1:32) {
    sumexposure[i] <- sumexposure[i] + exposure[i, j]
  }
}

In the future, I want to avoid this and just account for the varying number of observations per individual using bayesian hierarchal models but for now, I really just need to work this part out. I apologise if anything above is confusing and am willing to have questions to clarify.

0

There are 0 best solutions below