Abstract
Monte-Carlo Tree Search (MCTS) is a recent paradigm for game-tree search that gradually builds a game tree in a best-first fashion based on the results of randomized simulation play-outs. The performance of such an approach depends heavily on both the total number of simulation play-outs and their quality. These two factors are, however, typically inversely correlated: improving the quality of the play-outs generally involves adding knowledge that requires extra computation, thus allowing fewer play-outs to be performed per time unit. The general practice in MCTS seems to favor relatively knowledge-light play-out strategies in order to get additional simulations done. In this paper we show, for the game Lines of Action (LOA), that this is not necessarily the best strategy. The newest version of our simulation-based LOA program, MC-LOAαβ, uses a selective 2-ply αβ-search at each step in its play-outs for choosing a move. Even though this reduces the number of simulations by more than a factor of two, the new version outperforms previous versions by a large margin, achieving a winning score of approximately 60%.
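The sketch below illustrates the idea described in the abstract: instead of a random or knowledge-light policy, each play-out move is chosen by a shallow 2-ply alpha-beta look-ahead. This is a minimal, hypothetical illustration only; the `GameState` interface (`legal_moves`, `play`, `is_terminal`, `evaluate`, `winner`) is an assumed placeholder, the alpha-beta shown is plain rather than selective, and none of this reflects the actual MC-LOAαβ engine or its LOA-specific evaluation function.

```python
import math

# Hypothetical sketch: an MCTS play-out where every move is picked by a
# shallow (2-ply) alpha-beta search instead of a purely random policy.
# `state` is assumed to expose legal_moves(), play(move), is_terminal(),
# evaluate() (heuristic score for the side to move) and winner().

def alphabeta(state, depth, alpha, beta):
    """Negamax alpha-beta over `depth` plies; returns a heuristic score."""
    if depth == 0 or state.is_terminal():
        return state.evaluate()  # value from the side to move's perspective
    best = -math.inf
    for move in state.legal_moves():
        score = -alphabeta(state.play(move), depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:        # cut-off: remaining moves cannot improve alpha
            break
    return best

def select_playout_move(state, depth=2):
    """Choose a play-out move using a 2-ply alpha-beta look-ahead."""
    best_move, best_score = None, -math.inf
    for move in state.legal_moves():
        score = -alphabeta(state.play(move), depth - 1, -math.inf, math.inf)
        if score > best_score:
            best_move, best_score = move, score
    return best_move

def playout(state):
    """Run one simulation to the end of the game and return the winner."""
    while not state.is_terminal():
        state = state.play(select_playout_move(state))
    return state.winner()
```

The trade-off the paper measures follows directly from this loop: every simulated move now costs a small search plus evaluation calls, so fewer play-outs fit in the same time budget, but each play-out is of higher quality.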
| Original language | English |
| --- | --- |
| Title of host publication | 2011 IEEE Conference on Computational Intelligence and Games, CIG 2011, Seoul, South Korea, August 31 - September 3, 2011 |
| Editors | Sung-Bae Cho, Simon M. Lucas, Philip Hingston |
| Publisher | IEEE |
| Pages | 110-117 |
| Number of pages | 8 |
| DOIs | |
| Publication status | Published - 2011 |