Lets say I have some data on the relative abundance of 3 species. Species 1 has a negative relationship with mean annual temperature (mat
). Relative abundance data are stored in the matrix, r.spp.y
:
#Simulate some species count data.
y1 <- round(rnorm(100, 100, 5))
mat <- rnorm(length(y1), 10, 3)
y1 <- y1 + round(mat*-3)
y2 <- round(rnorm(100, 100, 5))
y3 <- round(rnorm(100, 100, 5))
spp.y <- as.matrix(data.frame(y1,y2,y3))
r.spp.y <- spp.y / rowSums(spp.y)
I can fit a beta regression to each species one at a time in JAGS to show that there is negative correlation between mat
and the relative abundance of species 1, but a positive correlation between mat
and the relative abundances of species 2 and 3 using JAGS:
library(runjags)
beta.model = "
model{
#priors
m0 ~ dnorm(0, 1.0E-3) #intercept prior
m1 ~ dnorm(0, 1.0E-3) # mat prior
t0 ~ dnorm(0, .01)
tau <- exp(t0)
#drop beta
for (i in 1:N){
y[i] ~ dbeta(p[i], q[i])
p[i] <- mu[i] * tau
q[i] <- (1 - mu[i]) * tau
logit(mu[i]) <- m0 + m1 * mat[i]
}
}#close model loop.
"
output <- list()
for(i in 1:ncol(r.spp.y)){
beta.data <- list(y = r.spp.y[,i], mat = mat, N = nrow(r.spp.y))
jags.beta <- run.jags(beta.model,
data=beta.data,
adapt = 200,
burnin = 1000,
sample = 1000,
n.chains = 3,
monitor = c('m0','m1'))
output[[i]] <- summary(jags.beta)
}
names(output) <- c('species 1','species 2','species 3')
This shows each has an intercept value and some relationship with mat
. Specifically, species 1 has a negative relationship with mat
while species 2 and 3 exhibit a positive relationship with mat
which is reflected in the values of the m1
parameter.
$`species 1`
Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD SSeff AC.10 psrf
m0 -0.71012252 -0.66164521 -0.61324227 -0.66175203 0.025504290 -0.65828829 0.0025395302 10.0 101 0.4951512 1.005070
m1 -0.04608964 -0.04139645 -0.03671938 -0.04138209 0.002453942 -0.04148652 0.0002420001 9.9 103 0.5013018 1.006634
$`species 2`
Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD SSeff AC.10 psrf
m0 -0.74993189 -0.70794571 -0.67146076 -0.70773078 0.019593399 -0.70704999 0.0018992840 9.7 106 0.4913926 1.006231
m1 0.01479929 0.01849042 0.02212619 0.01844102 0.001845884 0.01825645 0.0001812411 9.8 104 0.4964613 1.006281
$`species 3`
Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD SSeff AC.10 psrf
m0 -0.71786559 -0.67713045 -0.63342213 -0.67726218 0.021507927 -0.67767494 0.001908073 8.9 127 0.4398105 1.024932
m1 0.01160666 0.01560139 0.01958145 0.01561193 0.002017519 0.01576153 0.000187649 9.3 116 0.4513887 1.026890
I would like to fit all abundances simultaneously using a multivariate model. This can be done using the dirlichet distribution, which is the multivariate generalization of the beta distribution. To do this I setup a dirlichet model in JAGS, analogous to the beta above:
dirlichet.model = "
model {
#setup priors for each species
for(j in 1:N.spp){
m0[j] ~ dnorm(0, 1.0E-3) #intercept prior
m1[j] ~ dnorm(0, 1.0E-3) # mat prior
}
#implement dirlichet
for(i in 1:N){
for(j in 1:N.spp){
logit(a0[i,j]) <- m0[j] + m1[j] * mat[i]
}
y[i,1:N.spp] ~ ddirch(a0[i,1:N.spp])
}
} #close model loop.
"
I then setup a slightly different data object, and run the model:
jags.data <- list(y = r.spp.y,mat = mat,N = nrow(r.spp.y), N.spp = ncol(r.spp.y))
jags.out <- run.jags(dirlichet.model,
data=jags.data,
adapt = 200,
burnin = 2000,
sample = 2000,
n.chains=3,
monitor=c('m0','m1'))
summary(jags.out)
However the m0
and m1
parameters returned in no way resemble the parameters returned by looping the beta model. It seems that all parameters reflect the supplied (flat) prior distributions, and the data have not constrained the parameters in any way. This model cannot capture the negative relationship between mat
and the relative abundance of species 1. It doesn't even seem to constrain the data very much at all:
> summary(jags.out)
Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD SSeff AC.10 psrf
m0[1] -49.005761 6.181091 63.01256 6.393293 28.43734 6.628375 0.5359853 1.9 2815 -0.0018363868 1.000590
m0[2] -45.469227 6.723323 67.85340 6.978617 28.52249 5.924171 0.5417058 1.9 2772 0.0129357542 1.000718
m0[3] -49.227870 6.233448 61.60683 6.599494 28.43137 4.742497 0.5279870 1.9 2900 -0.0057574414 1.000546
m1[1] -1.869337 24.268826 64.16340 27.391725 19.13417 20.322548 0.4572891 2.4 1751 0.0109859218 1.006568
m1[2] -1.465451 23.725515 64.21979 27.071232 18.98391 15.823296 0.4294849 2.3 1954 -0.0005437107 1.002363
m1[3] -1.907332 24.031360 62.18261 27.017599 18.57743 18.597688 0.4290431 2.3 1875 -0.0086604393 1.001171
It is worth noting that if I fit an intercept-only dirlichet model (without the mat
covariate) it can capture the relative abundances well, however only if I use a log-link, rather than the logit-link used in the beta case (example here). Extending the log-link case and adding the mat
covariate does not solve the problem. What am I missing here?