I have written a custom environment so I can play around with reinforcement learning (PPO) and tf-agents. This works fine if I wrap my env (which inherits from py_environment.PyEnvironment) in a TFPyEnvironment, but it fails if I try to wrap it in a ParallelPyEnvironment. I have tried playing around with all the keyword arguments of ParallelPyEnvironment, but the code just runs up to that line and then nothing happens: no exception, and the program never terminates.

Here is my code initialising the environments, including the working variant for eval_env:
train_env = tf_py_environment.TFPyEnvironment(
    ParallelPyEnvironment(
        [CardGameEnv()] * hparams['parallel_environments']
    )
)

# this works perfectly:
eval_env = tf_py_environment.TFPyEnvironment(CardGameEnv(debug=True))
If I terminate the script via CTRL+C, this is what is being output (tracebacks from both the main process and the spawned child):
Traceback (most recent call last):
  File "E:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\poker_logic\train.py", line 229, in <module>
    train(model_num=3)
  File "E:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\poker_logic\train.py", line 64, in train
    [CardGameEnv()] * hparams['parallel_environments']
  File "E:\Users\tmp\AppData\Roaming\Python\Python37\site-packages\gin\config.py", line 1009, in wrapper
    return fn(*new_args, **new_kwargs)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 70, in __init__
    self.start()
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 83, in start
    env.start(wait_to_start=self._start_serially)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 223, in start
    self._process.start()
  File "C:\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Python37\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 264, in __getattr__
    return self._receive()
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 333, in _receive
    message, payload = self._conn.recv()
  File "C:\Python37\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Python37\lib\multiprocessing\connection.py", line 306, in _recv_bytes
    [ov.event], False, INFINITE)
KeyboardInterrupt

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Python37\lib\multiprocessing\spawn.py", line 113, in _main
    preparation_data = reduction.pickle.load(from_parent)
KeyboardInterrupt

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 289, in close
    self._process.join(5)
  File "C:\Python37\lib\multiprocessing\process.py", line 139, in join
    assert self._popen is not None, 'can only join a started process'
AssertionError: can only join a started process
From that I conclude that the process ParallelPyEnvironment is trying to spawn never actually starts, but since I'm not very experienced with multiprocessing in Python, I have no idea where to go from here, especially how to fix this.
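As context for the traceback above: on Windows, multiprocessing uses the "spawn" start method, so a child process is created by re-importing the main module and unpickling whatever was handed to it. That is why scripts that start processes need the usual entry-point guard; a minimal sketch, reusing train(model_num=3) from the traceback:

# Without this guard, each spawned child would re-run the whole
# training script when it re-imports the main module.
if __name__ == '__main__':
    train(model_num=3)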
Training currently takes a long time and does not use my PC's capabilities at all (3 GB of 32 GB RAM used, processor at 3%, GPU barely working although its VRAM is full), so running the environments in parallel should speed up training significantly.
The solution is to pass in callables, not environment instances, so ParallelPyEnvironment can construct the environments itself in its worker processes:
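In the snippet from the question, the class CardGameEnv is itself a callable that returns a fresh instance, so the fix is a sketch along these lines; the list now holds constructors for each worker to call, rather than environment objects that would have to be pickled:

train_env = tf_py_environment.TFPyEnvironment(
    ParallelPyEnvironment(
        # a list of callables: each worker process constructs its own
        # CardGameEnv instead of receiving a pickled instance
        [CardGameEnv] * hparams['parallel_environments']
    )
)

If the constructor needs arguments, a small named function or functools.partial(CardGameEnv, debug=False) should serve as the callable as well.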