When should a monto carlo tree search be reset?

97 Views Asked by Tue At 27 July 2025 at 23:59

I am trying to build/understand exactly how the monte carlo tree search algorithm (mcts) works in conjunction with a neural network to learn how to play games (like chess). But I'm having trouble understanding when to reset the tree.

I have looked at https://github.com/suragnair/alpha-zero-general as an example. But one thing that doesn't make sense to me about this implementation is that it resets the tree after each individual game which doesn't seem right to me. (So a new tree is created everytime a new game is made). I thought the idea of mcts was to accumulate knowledge over a lot of games and only reset the tree once you have trained your network to predict new probabilities for each boardstate?

Is this me misunderstanding mcts or is this a bug in that particular implementation?

Original Q&A

There are 1 best solutions below

KarelPeeters On 12 January 2023 at 13:09

I'll assume you're talking about AlphaZero since that's what the repo you linked to does, but MCTS is a more general term.

Keeping old search trees around forever isn't really practical, it would take too much memory and most of them wouldn't end up being used ever again (since games generated later in the training process hopefully look very different from the early ones). The whole point is that we distill the knowledge we gained through tree search into a neural network, which we then use in future games to do an even better tree search.

As a small optimization it still makes makes sense to either keep the tree around for the next move, or to instead temporarily cache neural network outputs. I assume this is what you saw in the repo you linked.

When should a monto carlo tree search be reset?

There are 1 best solutions below

Related Questions in REINFORCEMENT-LEARNING

Related Questions in MONTE-CARLO-TREE-SEARCH

Trending Questions

Popular # Hahtags

Popular Questions