How to properly parallelize a blackbox likelihood in emcee


I am currently using the emcee package to get MCMC samples, following the tutorial at https://emcee.readthedocs.io/en/stable/tutorials/parallel/. The serial version of the code works but is slow, so I want to parallelize it to speed up the process.

My code currently looks as follows (pseudocode):

def loglike(theta):
    cosmo = blackbox_function_from_another_package(theta)
    
    model = calculate_model(cosmo)
    
    diff = data - model
    
    return -0.5 * (diff).T @ ivar @ diff

from multiprocessing import Pool

import emcee

def mcmc_parallel(loglikelihood, init_pos, nsamples):
    # init_pos has shape (nwalkers, ndim)
    nwalkers, ndim = init_pos.shape

    with Pool() as pool:
        sampler = emcee.EnsembleSampler(
            nwalkers, ndim, loglikelihood, pool=pool)
        sampler.run_mcmc(init_pos, nsamples, progress=True)

    # Return the sampler so the chain can be inspected afterwards
    return sampler

# Main code

sampler = mcmc_parallel(loglikelihood=loglike, init_pos=p0, nsamples=5)

data and ivar are global variables for pickling purposes, as described in the tutorial. Whenever I run this code, it hangs indefinitely, and when I interrupt the execution, the traceback ends at the following:

    294         try:    # restore state no matter what (e.g., KeyboardInterrupt)
    295             if timeout is None:
--> 296                 waiter.acquire()
    297                 gotit = True
    298             else:

I am not entirely sure what is happening or why the code is freezing. If anyone can help me with this, I would greatly appreciate it.

There is 1 best solution below


I had the same problem and found that this solution works for me:

import multiprocessing as mp

Pool = mp.get_context('fork').Pool
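
This replaces `from multiprocessing import Pool` in the question. My understanding of why it helps, which I have not verified against your setup, is that on platforms where the default start method is "spawn" (macOS since Python 3.8, and Windows), each worker starts a fresh interpreter and has to re-import the likelihood and its globals, which can hang in a notebook or with a blackbox package. With "fork", workers inherit the parent's memory instead. Below is a minimal sketch of the idea using only the standard library; `DATA`, `IVAR`, the toy `loglike` and `evaluate_walkers` are hypothetical stand-ins for the real pieces, and `pool.map` is used directly because emcee only needs a pool object with a `map` method.

```python
import multiprocessing as mp

# Hypothetical stand-ins for the real data and inverse variance, kept as
# module-level globals so forked workers inherit them without pickling.
DATA = [1.0, 2.0, 3.0]
IVAR = [1.0, 1.0, 1.0]  # diagonal inverse variance, for simplicity


def loglike(theta):
    """Toy Gaussian log-likelihood; the model is the constant theta[0]."""
    return -0.5 * sum(w * (d - theta[0]) ** 2 for d, w in zip(DATA, IVAR))


def evaluate_walkers(positions, processes=2):
    """Evaluate loglike at every walker position with a fork-context pool."""
    # "fork" exists only on POSIX systems; get_context raises ValueError
    # where it is unavailable (for example on Windows).
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        # With emcee, this pool would be passed as pool=pool to
        # EnsembleSampler; map() here mimics how the sampler uses it.
        return pool.map(loglike, positions)


if __name__ == "__main__":
    walkers = [[0.0], [2.0], [4.0]]
    print(evaluate_walkers(walkers))
```

Note that forking a process that already holds threads or open handles from the blackbox package can cause its own problems, so treat this as a workaround that worked for me rather than a guaranteed fix.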