trouble fitting a dirlichet model to simulated data in JAGS, implemented in R (beta works)

109 Views Asked by At

Lets say I have some data on the relative abundance of 3 species. Species 1 has a negative relationship with mean annual temperature (mat). Relative abundance data are stored in the matrix, r.spp.y:

#Simulate some species count data.
y1 <- round(rnorm(100, 100, 5)) 
mat <- rnorm(length(y1), 10, 3)
y1 <- y1 + round(mat*-3)
y2 <- round(rnorm(100, 100, 5)) 
y3 <- round(rnorm(100, 100, 5)) 
spp.y <- as.matrix(data.frame(y1,y2,y3))
r.spp.y <- spp.y / rowSums(spp.y)

I can fit a beta regression to each species one at a time in JAGS to show that there is negative correlation between mat and the relative abundance of species 1, but a positive correlation between mat and the relative abundances of species 2 and 3 using JAGS:

library(runjags)
beta.model = "
model{
  #priors
  m0 ~ dnorm(0, 1.0E-3) #intercept prior
  m1 ~ dnorm(0, 1.0E-3) #      mat prior
  t0 ~ dnorm(0, .01)
  tau <- exp(t0) 

  #drop beta
  for (i in 1:N){
  y[i] ~ dbeta(p[i], q[i])
  p[i] <- mu[i] * tau
  q[i] <- (1 - mu[i]) * tau
  logit(mu[i]) <- m0 + m1 * mat[i]
  }

}#close model loop.
"
output <- list()
for(i in 1:ncol(r.spp.y)){
  beta.data <- list(y = r.spp.y[,i], mat = mat, N = nrow(r.spp.y))
  jags.beta <- run.jags(beta.model,
                        data=beta.data,
                        adapt = 200,
                        burnin = 1000,
                        sample = 1000,
                        n.chains = 3,
                        monitor = c('m0','m1'))
  output[[i]] <- summary(jags.beta)
}
names(output) <- c('species 1','species 2','species 3')

This shows each has an intercept value and some relationship with mat. Specifically, species 1 has a negative relationship with mat while species 2 and 3 exhibit a positive relationship with mat which is reflected in the values of the m1 parameter.

$`species 1`
       Lower95      Median     Upper95        Mean          SD        Mode        MCerr MC%ofSD SSeff     AC.10     psrf
m0 -0.71012252 -0.66164521 -0.61324227 -0.66175203 0.025504290 -0.65828829 0.0025395302    10.0   101 0.4951512 1.005070
m1 -0.04608964 -0.04139645 -0.03671938 -0.04138209 0.002453942 -0.04148652 0.0002420001     9.9   103 0.5013018 1.006634

$`species 2`
       Lower95      Median     Upper95        Mean          SD        Mode        MCerr MC%ofSD SSeff     AC.10     psrf
m0 -0.74993189 -0.70794571 -0.67146076 -0.70773078 0.019593399 -0.70704999 0.0018992840     9.7   106 0.4913926 1.006231
m1  0.01479929  0.01849042  0.02212619  0.01844102 0.001845884  0.01825645 0.0001812411     9.8   104 0.4964613 1.006281

$`species 3`
       Lower95      Median     Upper95        Mean          SD        Mode       MCerr MC%ofSD SSeff     AC.10     psrf
m0 -0.71786559 -0.67713045 -0.63342213 -0.67726218 0.021507927 -0.67767494 0.001908073     8.9   127 0.4398105 1.024932
m1  0.01160666  0.01560139  0.01958145  0.01561193 0.002017519  0.01576153 0.000187649     9.3   116 0.4513887 1.026890

I would like to fit all abundances simultaneously using a multivariate model. This can be done using the dirlichet distribution, which is the multivariate generalization of the beta distribution. To do this I setup a dirlichet model in JAGS, analogous to the beta above:

dirlichet.model = "
model {
  #setup priors for each species
  for(j in 1:N.spp){
    m0[j] ~ dnorm(0, 1.0E-3) #intercept prior
    m1[j] ~ dnorm(0, 1.0E-3) #      mat prior
  }

  #implement dirlichet
  for(i in 1:N){
    for(j in 1:N.spp){
      logit(a0[i,j]) <- m0[j] + m1[j] * mat[i]
  }
  y[i,1:N.spp] ~ ddirch(a0[i,1:N.spp]) 
  }

} #close model loop.
"

I then setup a slightly different data object, and run the model:

jags.data <- list(y = r.spp.y,mat = mat,N = nrow(r.spp.y), N.spp = ncol(r.spp.y))
jags.out <- run.jags(dirlichet.model,
                     data=jags.data,
                     adapt = 200,
                     burnin = 2000,
                     sample = 2000,
                     n.chains=3,
                     monitor=c('m0','m1'))
summary(jags.out)

However the m0 and m1 parameters returned in no way resemble the parameters returned by looping the beta model. It seems that all parameters reflect the supplied (flat) prior distributions, and the data have not constrained the parameters in any way. This model cannot capture the negative relationship between mat and the relative abundance of species 1. It doesn't even seem to constrain the data very much at all:

> summary(jags.out)
         Lower95    Median  Upper95      Mean       SD      Mode     MCerr MC%ofSD SSeff         AC.10     psrf
m0[1] -49.005761  6.181091 63.01256  6.393293 28.43734  6.628375 0.5359853     1.9  2815 -0.0018363868 1.000590
m0[2] -45.469227  6.723323 67.85340  6.978617 28.52249  5.924171 0.5417058     1.9  2772  0.0129357542 1.000718
m0[3] -49.227870  6.233448 61.60683  6.599494 28.43137  4.742497 0.5279870     1.9  2900 -0.0057574414 1.000546
m1[1]  -1.869337 24.268826 64.16340 27.391725 19.13417 20.322548 0.4572891     2.4  1751  0.0109859218 1.006568
m1[2]  -1.465451 23.725515 64.21979 27.071232 18.98391 15.823296 0.4294849     2.3  1954 -0.0005437107 1.002363
m1[3]  -1.907332 24.031360 62.18261 27.017599 18.57743 18.597688 0.4290431     2.3  1875 -0.0086604393 1.001171

It is worth noting that if I fit an intercept-only dirlichet model (without the mat covariate) it can capture the relative abundances well, however only if I use a log-link, rather than the logit-link used in the beta case (example here). Extending the log-link case and adding the mat covariate does not solve the problem. What am I missing here?

0

There are 0 best solutions below