Gymnasium/PettingZoo: Getting Tic Tac Toe to show ansi text


Using the Tic Tac Toe environment:

from pettingzoo.classic import tictactoe_v3

env = tictactoe_v3.env(render_mode="ansi")
env.reset(seed=1)

env.step(1)
print(env.render())

This outputs an empty string '' and also launches an unnecessary Python window that cannot be opened. With render_mode="human" it properly displays a graphical board in the new window, and with render_mode="rgb_array" it prints a long array to the terminal.

I just want a text output of my tic tac toe board. What am I missing?

Best answer:

It is not clear to me why the window opens; I would also expect the ansi render mode to do its rendering in the terminal. I guess this is a glitch in the tic-tac-toe implementation. What you seem to be looking for is a representation of the environment state. However, according to the documentation (see state()), not all environments support this. The documentation is also a bit misleading about these rendering methods.

Sure enough, for tic-tac-toe:

> env.state()

NotImplementedError: state() method has not been implemented in the environment tictactoe_v3.

The environment computes the board state internally so that it can render it (to a window, unfortunately), but that state is not directly accessible. It so happens that in this game the state equals the observation of either player, since both see the whole board after every turn. So you can implement your own state function on top of an observation. Here is my version:

import numpy as np
from pettingzoo.classic import tictactoe_v3

env = tictactoe_v3.raw_env(render_mode=None)
env.reset(seed=42)

env.step(0)
env.step(1)
env.step(2)
env.step(3)
env.step(4)


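def get_state(env):
    # The observation is a 3x3x2 array; flattened, the two planes interleave.
    obs = env.observe("player_1")["observation"]
    rvl = obs.ravel()

    # Weights: 1 for even entries (player_1's marks), 2 for odd entries (opponent's).
    arr = np.empty(rvl.shape, int)
    arr[::2] = 1
    arr[1::2] = 2

    rvl *= arr

    # Split the planes again and transpose them to match the displayed board.
    grp_x = np.array(rvl[::2]).reshape(3, 3).T
    grp_o = np.array(rvl[1::2]).reshape(3, 3).T
    res = grp_x + grp_o

    # Map the numeric codes to printable marks.
    dct = {1: "X", 2: "O", 0: " "}
    return np.vectorize(dct.get)(res)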


res = get_state(env)
print(res)

env.close()

Result:

[['X' 'O' ' ']
 ['O' 'X' ' ']
 ['X' ' ' ' ']]
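
Since the question asks for a plain text board, the array above can be joined into a single string for printing. The following is only a minimal sketch: format_board is a hypothetical helper of my own, not part of PettingZoo, and it assumes a 3x3 grid of one-character marks such as the one get_state returns (a nested list works the same way).

```python
def format_board(cells):
    """Join a grid of one-character marks into a multi-line string.

    Hypothetical helper, not part of PettingZoo. `cells` is any iterable
    of rows, e.g. the array returned by get_state or a nested list.
    """
    # One text row per board row, with cells separated by vertical bars.
    rows = [" | ".join(str(mark) for mark in row) for row in cells]
    # A dashed line as wide as a text row goes between consecutive rows.
    separator = "\n" + "-" * len(rows[0]) + "\n"
    return separator.join(rows)


# The same position as in the result above, written as a nested list.
board = [
    ["X", "O", " "],
    ["O", "X", " "],
    ["X", " ", " "],
]
print(format_board(board))
```

Calling print(format_board(get_state(env))) before env.close() should then print the current position without going through env.render().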