I'm working with the pscl
package in R and trying to get it to produce testable/reproducible results. I've taken a look at the underlying C code and it appears as though GetRNGstate()
and PutRNGstate()
are being called in the right places but it seems impossible to repeat output from the MCMC model.
I've packaged up the functions in simulationResult
from the SoDA package so I can verify the start state of each simulation R on the R side.
library(pscl)
library(SoDA)
run1 <- simulationResult(
ideal(s109,
normalize=TRUE,
maxiter = 500,
thin = 10,
burnin = 0),
seed = 42)
run2 <- simulationResult(
ideal(s109,
normalize=TRUE,
maxiter = 500,
thin = 10,
burnin = 0),
seed = 42)
We can verify that the start states are the same at least on the R side:
all.equal(run1@firstState, run2@firstState)
But the output is different:
all.equal(run1@result$xbar, run2@result$xbar)
I can increase the number of iterations but that shouldn't really matter if the RNG state is getting propogated. Am I missing something really simple? Thanks.
Edit: I should also note that all.equal(run1@lastState, run2@lastState)
(the end state of each run) should be the same but they end up different. My guess is that some source of contingency outside of the R RNG functions called by C is impacting the number of times those RNG functions are called. Curious.
Edit2
I should also add I'm on R 3.0.1 with pscl 1.04.4 on OS X 10.8.4.
As suspected by OP and @SchaunW, the problem lies in the C code. "A bit" of digging revealed a quite subtle problem (see the source code, not the newest version though):
All the sampling in ideal.c appears in the part of commence iterations, i.e. where functions
updatex
,updatey
and others are used. However, the problem is with one of the arguments of these functions - the matrixok
(ironic, right?). It is used byupdatex
andupdateb
and only the positions whereok == 1
are important (incrosscheck
,crosscheckx
).Before that, some values of
ok
are assigned to be 1 incheck(y,ok,n,m)
.However, at the very beginning the initial values of
ok
are denoted bywhich allocates a integer matrix (see util.c for
imatrix
). The problem is that thenok
contains various numbers, i.e. not only zeros and sometimes ones. It seems that they are not related to RNG state of R, which explains the behaviour noted by @SchaunW:all.equal(run1@result$xbar, run2@result$xbar)
returnsTRUE
if!any(ok == 1)
and vice versa. Also, different number of ones explains differentlastState
.I am not an expert in C and I am not sure whether there is a logic error in the code or if the
imatrix
function should be corrected, but a straightforward fix could be to fillok
with zeros right after initialisation:Finally, there is also a fix that does not include modifying the C code (it might not be right for your applications though). Functions
crossxyi
,crossxyj
are used instead ofcrosscheck
,crosscheckx
(the bad ones) whenimpute = TRUE
forideal
.