I'd like to add a RewardSum() transform
to the PettingZoo env wrapper in torchrl.
Here is the definition of my env:
from torchrl.envs.libs.pettingzoo import PettingZooEnv
from torchrl.envs.utils import MarlGroupMapType
env = PettingZooEnv(
    task="mpe/simple_spread_v3",
    parallel=False,
    use_mask=True,  # Must use it since one player plays at a time
    group_map=None,  # Use default for AEC (one group per player)
)
What I am trying to do:
from torchrl.envs import RewardSum, TransformedEnv
from torchrl.envs.utils import check_env_specs
env = TransformedEnv(
    env,
    RewardSum(),
)
check_env_specs(env)
Here is the error I get (split into multiple lines):
ValueError: Could not match the env reset_keys ['_reset'] with
the in_keys [('agent_0', 'reward'), ('agent_1', 'reward'),
('agent_2', 'reward')].
Please make sure that these have the same length.
I am expecting the sum of the rewards of all agents in my env to be accessible via ["next", "episode_reward"],
which is the default access key defined by torchrl.
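Basically, the issue is that the environment has a single global done state (hence the single reset key '_reset') but a different reward key for each group, so the two lists have different lengths.
Passing one reset key per reward key:
RewardSum(reset_keys=["_reset"] * len(env.group_map.keys()))
should work.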
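To make the length requirement concrete, here is a minimal plain-Python sketch of the bookkeeping involved. It does not use torchrl and is not how RewardSum is implemented; match_reset_keys and accumulate are hypothetical helper names, and the reward values are made up for illustration.

```python
# Plain-Python sketch of the idea; this is NOT torchrl's implementation.
# match_reset_keys and accumulate are hypothetical helper names.


def match_reset_keys(reward_keys, reset_keys):
    """Pair each reward key with a reset key, requiring equal lengths."""
    if len(reward_keys) != len(reset_keys):
        raise ValueError(
            f"Could not match reset_keys {reset_keys} with in_keys {reward_keys}"
        )
    return dict(zip(reward_keys, reset_keys))


def accumulate(episode_sums, step_rewards):
    """Add one step of per-group rewards to the running per-group sums."""
    return {
        key: episode_sums.get(key, 0.0) + reward
        for key, reward in step_rewards.items()
    }


groups = ["agent_0", "agent_1", "agent_2"]  # one group per agent, as in the post
reward_keys = [(group, "reward") for group in groups]

# One global reset key, repeated once per reward key.
reset_keys = ["_reset"] * len(groups)
pairing = match_reset_keys(reward_keys, reset_keys)

# Made-up rewards for two steps, only to show the accumulation.
steps = [
    {("agent_0", "reward"): 1.0, ("agent_1", "reward"): 0.5, ("agent_2", "reward"): 0.0},
    {("agent_0", "reward"): 0.0, ("agent_1", "reward"): 0.5, ("agent_2", "reward"): 2.0},
]
episode_sums = {}
for step in steps:
    episode_sums = accumulate(episode_sums, step)

# A single total over all agents is computed from the per-group sums.
total = sum(episode_sums.values())
print(pairing)
print(episode_sums, total)
```

Note that, as far as I can tell, the transform keeps one running sum per reward key (for example ("agent_0", "episode_reward")) rather than a single ["next", "episode_reward"] entry, so a total over all agents would have to be summed from the per-group values as in the last step above. It is worth checking the output keys with check_env_specs or a short rollout.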