How to use the same Python environment across a compute pool

I have a compute pool of ~20 servers. I run different types of numerical simulations, often with hundreds of them running simultaneously. I create a new, unique Python environment for each project for reproducibility and archiving purposes.

Moving the Python environment across the compute pool (and updating it when needed) is a pain; I've used conda-pack for this in the past. Recently, I tried putting the environment on a network share that the whole compute pool can access, and then pointing the code at it from a batch file that runs everything:

set python="X:\99\99\9999\\python_env\my_env\python.exe"
%python% something.py
%python% something_else.py

This generally works, but I get sporadic failures. The program I use to manage the parallel runs (PESTPP) can deal with errors and will retry any failed runs. They always work the second time, but this adds run time because the whole simulation has to be run again. The fact that the runs succeed on retry, and also succeed when I use a local Python environment, suggests that the failures are related to overloading the shared environment (many simultaneous runs reading it over the network)?
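To pin down what the sporadic failures actually are, a small wrapper along these lines could stand in for the batch file. This is only a sketch: it runs a script with a given interpreter, prints the exit code and stderr of any failed attempt, and retries a limited number of times. The name `run_with_retry`, the retry count, and the delay are hypothetical choices for illustration and are not part of PESTPP; `sys.executable` stands in for the path to the shared `python.exe`.

```python
import subprocess
import sys
import time


def run_with_retry(python_exe, script_args, retries=2, delay_s=1.0):
    """Run a script with the given interpreter, retrying on a non-zero exit.

    Returns (attempts_used, CompletedProcess of the last attempt).
    """
    for attempt in range(1, retries + 2):
        result = subprocess.run(
            [python_exe, *script_args],
            capture_output=True,  # keep stdout/stderr so failures can be inspected
            text=True,
        )
        if result.returncode == 0:
            return attempt, result
        # Record what went wrong before retrying.
        print(
            f"attempt {attempt} failed (exit {result.returncode}):\n{result.stderr}",
            file=sys.stderr,
        )
        if attempt <= retries:
            time.sleep(delay_s)  # brief pause before the next attempt
    return attempt, result


if __name__ == "__main__":
    # sys.executable is a placeholder for the shared interpreter path.
    attempts, result = run_with_retry(sys.executable, ["-c", "print('ok')"])
    print(attempts, result.stdout.strip())
```

Retrying only the failed script, rather than the whole simulation, would at least show whether the errors are import or file-access errors from the network share, assuming the captured stderr contains a traceback.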

Is there something I can do to stop these errors when using a shared Python environment, or should I just go back to shuttling environments around with conda-pack?
