Emcee is getting stuck while while running in multiprocessing mode

183 Views Asked by At

I have written an mcmc code using emcee python

### MCMC Parameters
# initial guesses for the free parameters
initial = np.float64([1.e-13, -2]) 

# set no of walkers
nwalkers = 100

# set no of steps
nsteps = 3000

# defining the model
def model(theta, ..):
    # here I am calling 1 external codes using subprocess

    # writes an input file depending on the freeparameters
    write_numdens(theta, rrr)
    
    # calling the first external code 
    line = "radmc3d image imolspec 1 iline 1 incl {} posang {} widthkms {} vkms {} nodust nostar".format(inc,PA,6,6.41)
    subprocess.run([line],shell=True)

    # output of the radmc3d is being used by vis_sample
    model_vis = vis_sample(imagefile='image.out', uvfile=uvfile, src_distance=dist, verbose=True)
    
    # returns 2d array with complex values
    return model_vis 


def lnlike(theta,rrr,y_obs):
    y_model = model(theta, rrr)
    LnLike = -0.5*np.sum(weights*(((y_obs.real-y_model.real))**2 + ((y_obs.imag-y_model.imag))**2))
    return LnLike

### prior
def lnprior(theta):
    X_0, alpha = theta
    # make sure priors are reasonable for source
    if 1e-20 < X_0 < 1e-8 and -3. < alpha < 2.:
        return 0.0
    return -np.inf


### log posterior prob
def lnprob(theta, rrr, y_obs):
    lp = lnprior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike(theta, rrr, y_obs) #recall if lp not -inf, its 0, so this just returns likelihood


### The MCMC run
#  A list of extra positional arguments which will be called for lnprob along with theta
data = (rrr, y_obs)
# no of parameters
ndim = len(initial)
# this is a multivariate Gaussian centered on each theta, with a small sigma.
p0 = [np.array(initial) + 1e-3 * np.random.randn(ndim) for i in range(nwalkers)]


def main(p0,nwalkers,nsteps,ndim,lnprob,data):
    with multiprocessing.Pool(nthreads) as pool:
        sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=data)

        print("Running burn-in...")
        p0, _, _ = sampler.run_mcmc(p0, 100, progress=True)
        sampler.reset()

        print("Running production...")
        pos, prob, state = sampler.run_mcmc(p0, nsteps, progress=True)

    return sampler, pos, prob, state

sampler, pos, prob, state = main(p0, nwalkers, nsteps, ndim, lnprob, data)

This code works fine when running without parallelisation. When I run the code with parallelization, the code is getting stuck at vis_sample? I am not able to figure out Why? Here model of the code is accessing two external codes radmc3d and vis_sample. Both the external codes are very extensive. The output of radmc3d is being used by vis_sample. radmc3d has run perfectly fine and produce the required output. vis_sample starts running and get stuck.

I have tried using emcee.interruptible_pool. But it did not change anything. Is there any other way I can solve this issue?

0

There are 0 best solutions below