I'm trying to estimate a Weibull PH model using the OpenBUGS algorithm but encounter with some problem on the result of the estimated parameter. The reason for writing this algorithm is because I will extend this study into other PH model that use different parametric distribution.
Below is the R code used to estimate the parameter. As an additional note, the data used is coming from the 'veteran' data set of the 'survival' package. While the prior distribution used for beta (coefficient for covariate) is following the Normal(0,0.0001) and weibull parameters are both following Gamma(10,10).
rm(list=ls())
library(R2OpenBUGS)
library(rjags)
library(coda)
library(MCMCvis)
library(survival)
# weibull proportional hazard
mod = function(){
for(i in 1:n){
ph.const[i] <- exp(beta*X[i]) # exp(xB)
H[i] <- pow(k*Time[i],alpha)*ph.const[i] #Cumulative hazard
logHaz[i] <- log(alpha*k*pow(k*Time[i],alpha-1)*ph.const[i])
logSurv[i] <- log(pow(exp(-pow(k*Time[i],alpha)),ph.const[i]))
# using zeros-trick
phi[i] <- 100000 - delta[i] * logHaz[i] - logSurv[i]
zeros[i]~dpois(phi[i])
}
# prior distribution
beta~dnorm(0,0.001)
alpha1~dgamma(10,10)
alpha <- 1/alpha1
k~dgamma(10,10)
}
model.file = "C:\\Users\\farea\\Documents\\RWorkspace\\model.txt"
write.model(mod,model.file)
n <- nrow(veteran)
t <- veteran$time
x <- veteran$age
d <- veteran$status
d.jags <- list(n=n,Time=t,X=x,delta=d)
p.jags <- c('beta','k','alpha')
i.jags <- function(){list(k=0.5,beta=-1,alpha1=4)}
jmod <- jags.model(file=model.file, data=d.jags, n.chains=3, inits=i.jags, n.adapt=1000)
update(jmod, n.iter=3000, by=1)
poster <- coda.samples(jmod, p.jags, n.iter=50000, thin=10)
gelman.diag(poster)
summary(poster)
The estimated parameter are as follows:
> gelman.diag(poster)
Potential scale reduction factors:
Point est. Upper C.I.
alpha 1 1
beta 1 1
k 1 1
Multivariate psrf
1
> summary(poster)
Iterations = 3010:53000
Thinning interval = 10
Number of chains = 3
Sample size per chain = 5000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
alpha 1.1133 0.3924 0.003204 0.003171
beta -0.4234 31.6186 0.258164 0.260125
k 1.0011 0.3185 0.002601 0.002565
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
alpha 0.5884 0.840 1.0341 1.294 2.077
beta -61.8011 -22.047 -0.5133 21.207 61.525
k 0.4775 0.771 0.9668 1.197 1.708
Although the MCMC converge, my concern is that the value for beta's credible interval is too wide which is contradict with the result shown from this paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9184216/).
AN UPDATE: I tried to change the prior distribution for beta as Normal(0,100) and the credible interval return a reasonable estimate as below
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
alpha 1.111e+00 0.3898 0.0031827 0.003183
beta 2.862e-05 0.1001 0.0008172 0.000813
k 9.966e-01 0.3139 0.0025627 0.002581
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
alpha 0.5857 0.84044 1.035e+00 1.29294 2.1148
beta -0.1986 -0.06703 6.411e-05 0.06833 0.1926
k 0.4782 0.76918 9.651e-01 1.18959 1.7043
How should I know which prior distribution is reliable?
Is there any other method that is simple to code by ourself but can brings a reliable estimate on any proportional hazard model?