Weird results with Unity ML-Agents Python API

Question

677 Views Asked by Owen_w At 20 August 2025 at 02:50

I am using the 3DBall example environment, but I am getting some weird results that I don't understand. My code so far is just a for loop that prints the reward and fills in the required actions with random values. However, a negative reward is never shown, and at random points there are no decision steps at all. That would make sense, but shouldn't the environment just keep simulating until there is a decision step? Any help would be greatly appreciated, as other than the documentation there are few to no resources out there for this.

import random
import numpy
from mlagents_envs.environment import UnityEnvironment

env = UnityEnvironment()
env.reset()
behavior_names = env.behavior_specs

for i in range(50):
    arr = []
    behavior_names = env.behavior_specs
    for i in behavior_names:
        print(i)
    DecisionSteps = env.get_steps("3DBall?team=0")
    print(DecisionSteps[0].reward,len(DecisionSteps[0].reward))
    print(DecisionSteps[0].action_mask) #for some reason it returns action mask as false when Decisionsteps[0].reward is empty and is None when not


    for i in range(len(DecisionSteps[0])):
        arr.append([])
        for b in range(2):
            arr[-1].append(random.uniform(-10,10))
    if(len(DecisionSteps[0])!= 0):
        env.set_actions("3DBall?team=0",numpy.array(arr))
        env.step()
    else:
        env.step()
env.close()

**Marcus** · Accepted Answer

I think your problem is that when an agent's episode terminates and needs to be reset, the agent does not return a decision_step but a terminal_step. In 3DBall this happens when the agent drops the ball, and the reward returned in the terminal_step is -1.0. Since your code only looks at the decision steps, it never sees that negative reward. I have taken your code and made some changes, and now it runs fine (except that you probably don't want to reset the whole environment every time one of the agents drops its ball).

import numpy as np
from mlagents_envs.environment import UnityEnvironment

# -----------------
# Close an environment that might have been left open by an earlier run
# (e.g. in a notebook, where `env` can still exist from a previous cell).
try:
    env.close()
except Exception:
    pass
# -----------------

env = UnityEnvironment(file_name = None)
env.reset()

for i in range(1000):
    behavior_names = env.behavior_specs

    # Go through all existing behaviors
    for behavior_name in behavior_names:
        decision_steps, terminal_steps = env.get_steps(behavior_name)

        for agent_id_terminated in terminal_steps:
            print("Agent " + behavior_name + " has terminated, resetting environment.")
            # This is probably not the desired behaviour, as the other agents are still active. 
            env.reset()

        actions = []
        for agent_id_decisions in decision_steps:
            actions.append(np.random.uniform(-1,1,2))

        # print(decision_steps[0].reward)
        # print(decision_steps[0].action_mask)

        if len(actions) > 0:
            env.set_actions(behavior_name, np.array(actions))
    try:
        env.step()
    except Exception:
        print("Something happened when taking a step in the environment.")
        print("The communicator has probably terminated, stopping simulation early.")
        break
env.close()
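
If you want to see the negative reward without resetting the whole environment, one option is to read the rewards from both objects that env.get_steps() returns. The following is only a minimal sketch: split_rewards is a hypothetical helper name, and the SimpleNamespace objects are stand-ins with made-up values so the snippet runs without Unity. It assumes only that both step objects expose agent_id and reward, which DecisionSteps and TerminalSteps do.

```python
from types import SimpleNamespace


def split_rewards(decision_steps, terminal_steps):
    """Return (ongoing, finished), two {agent_id: reward} dicts.

    ongoing  -> agents that are requesting a new action this step
    finished -> agents whose episode just ended (e.g. the ball was dropped)
    """
    ongoing = {
        int(agent_id): float(reward)
        for agent_id, reward in zip(decision_steps.agent_id, decision_steps.reward)
    }
    finished = {
        int(agent_id): float(reward)
        for agent_id, reward in zip(terminal_steps.agent_id, terminal_steps.reward)
    }
    return ongoing, finished


# Stand-in data (hypothetical values) shaped like the pair from env.get_steps():
# agents 0 and 2 are still balancing, agent 1 has just dropped its ball.
decision_steps = SimpleNamespace(agent_id=[0, 2], reward=[0.1, 0.1])
terminal_steps = SimpleNamespace(agent_id=[1], reward=[-1.0])

ongoing, finished = split_rewards(decision_steps, terminal_steps)
print(ongoing)   # {0: 0.1, 2: 0.1}
print(finished)  # {1: -1.0}
```

In the loop above you would call it as split_rewards(*env.get_steps(behavior_name)) and only send actions for the agents in decision_steps. The two dicts are kept separate on purpose, because an agent can show up in both when its episode ends and a new one starts within the same step.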
