Monte Carlo simulation - implementing the UCT select function


I'm trying to implement a Monte Carlo simulation for a card game with the following rules:

  • 2 players. Every player gets the same set of 5 different cards. Every card can be marked on the back with a circle, star or square. Each card has 3 numbers: blue, red and green (e.g. one card can be blue=10, red=11, green=23).
  • The player that starts (after a coin flip) plays one card and names a color. If he played blue, the opponent has to respond with green. If he played red, the opponent responds with red. And if he played green, the opponent responds with blue.
  • The opponent can see only the mark on the card that was played (circle, star or square) and which color was named, but not the number.
  • The round ends by comparing the values assigned to those colors on the two cards played. The one with the larger value wins. If the values are the same, it is a draw. Whoever wins leads the next round. After a draw, the player who led leads again.
  • The game ends when one of the players has a score that the other can no longer reach, or when there are no more cards.
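
To make the round rule concrete, here is a minimal Python sketch of how a single round could be resolved. The names `Card`, `RESPONSE_COLOR` and `resolve_round` are made up for illustration, and it assumes the responder's value is read from the color he is forced to answer with:

```python
from dataclasses import dataclass

# Color the responder must answer with for each color the leader names
# (taken from the rules above).
RESPONSE_COLOR = {"blue": "green", "red": "red", "green": "blue"}


@dataclass(frozen=True)
class Card:
    mark: str   # "circle", "star" or "square" (visible on the back)
    blue: int
    red: int
    green: int


def resolve_round(lead_card, lead_color, reply_card, leader):
    """Return (winner, next_leader); winner is 0, 1, or None for a draw."""
    lead_value = getattr(lead_card, lead_color)
    reply_value = getattr(reply_card, RESPONSE_COLOR[lead_color])
    if lead_value > reply_value:
        return leader, leader
    if reply_value > lead_value:
        return 1 - leader, 1 - leader
    # Draw: nobody scores and the same player leads again.
    return None, leader
```

The point that matters for the search is that `next_leader` is not simply "the other player": the same player can move several times in a row.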

I implemented the algorithm, but it sometimes selects moves that do not seem to be the best. I have a feeling that I need to adjust the UCT selection function, perhaps by giving lower priority to moves after which the other player takes over the lead.
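
For reference, this is roughly the shape of UCT selection I have in mind, written as a simplified sketch rather than my exact code. `Node`, `uct_select` and `backpropagate` are illustrative names. It assumes each node stores which player moves next and keeps win totals per player, so a child is scored from the point of view of the player choosing at the parent instead of flipping the sign at every level (which would be wrong here, since the same player can lead twice in a row). It ignores the hidden card values entirely:

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    player_to_move: int                       # 0 or 1; not tied to tree depth
    visits: int = 0
    wins: list = field(default_factory=lambda: [0.0, 0.0])  # reward per player
    children: dict = field(default_factory=dict)            # move -> Node


def uct_select(node, c=math.sqrt(2)):
    """Pick the (move, child) pair with the highest UCB1 score for node's mover."""
    mover = node.player_to_move

    def score(child):
        if child.visits == 0:
            return math.inf                   # try unvisited children first
        exploit = child.wins[mover] / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        return exploit + explore

    return max(node.children.items(), key=lambda kv: score(kv[1]))


def backpropagate(path, rewards):
    """Add a playout result (reward for player 0, reward for player 1) to every node on the path."""
    for node in path:
        node.visits += 1
        node.wins[0] += rewards[0]
        node.wins[1] += rewards[1]
```

With this layout there is no explicit penalty for handing the lead to the opponent; whether a move does so only affects the child's `player_to_move`, and the playout results are supposed to capture the rest.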

Any ideas whether there is a better algorithm to use here? Maybe minimax with some optimizations would be better?
