How could a minimax algorithm be more optimistic?

619 Views Asked by At

Minimax seems to do a great job of not losing, but it's very fatalistic in assuming the opponent won't make a mistake. Of course a lot of games are solved to a draw, but one should be playing for "Push as hard as possible for a win without risking losing", even when no forced wins are available. That is, given two trees with the same (drawn) end position given optimal play, how could the algorithm be adjusted to prefer the one which is most likely to win if the opponent makes a sub-optimal move, or make the opponent more likely to slip up?

Using the simple example of Tic-Tac-Toe, a stronger player would often aim to set up forks and thereby guarantee wins. Even though the opponent could see such a trick coming and stop it beforehand, they're more likely to miss that than if you just put two Xs in an empty row and hope they momentarily forget what game they're playing. Similarly a strong player would tend to start in the centre or perhaps a corner, but in simple minimax there's no reason (since you can still force a draw) not to pick an edge square.

2

There are 2 best solutions below

1
On

How about tweaking "min" nodes?

In regular minimax, when evaluating a position for the opponent, the score is the minimum score for each of his moves. Injecting some optimism (from the "max" player's pov) into the search could be done by using a different function than the minimum. Some things that could be tried out:

-using the second worst score

-using a mix between the min and the average (or median)

Perhaps this should be tied to an optimism factor that increases with the depth of the node. This would avoid ignoring a very bad move by the opponent lower in the tree (which in most games would mean a more obvious move).

1
On

If I understand your question correctly, you're asking how to adjust the minimax algorithm so that it doesn't assume the opponent always makes the best move.

Look into the expectiminimax algorithm, a variation of minimax. Essentially, instead of dealing with only min or max nodes, it introduces chance nodes which store a probability that the opponent will choose the current move.

To make it even simpler, you could assume the opponent selects each move (node) with equal probability.

In short, when its the opponents turn, instead of returning the minimum score, return the average score of their possible moves.