αβ-based play-outs in Monte-Carlo Tree Search

Mark H. M. Winands, Yngvi Björnsson

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Monte-Carlo Tree Search (MCTS) is a recent paradigm for game-tree search, which gradually builds a game tree in a best-first fashion based on the results of randomized simulation play-outs. The performance of such an approach depends heavily on both the total number of simulation play-outs and their quality. The two are, however, typically inversely correlated: improving the quality of the play-outs generally involves adding knowledge that requires extra computation, thus allowing fewer play-outs per time unit. The general practice in MCTS leans towards relatively knowledge-light play-out strategies for the benefit of getting more simulations done. In this paper we show, for the game Lines of Action (LOA), that this is not necessarily the best strategy. The newest version of our simulation-based LOA program, MC-LOAαβ, uses a selective 2-ply αβ search at each step of its play-outs to choose a move. Even though this reduces the number of simulations by more than a factor of two, the new version outperforms previous versions by a large margin, achieving a winning score of approximately 60%.
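The core idea of the abstract, replacing a knowledge-light random play-out policy with a selective, shallow αβ search at every play-out step, can be sketched as follows. This is an illustrative reconstruction on a toy game (normal-play Nim), not the paper's LOA implementation: the game, the nim-sum heuristic, the candidate-sampling scheme, and all function names are assumptions introduced here for the sake of a runnable example.

```python
import random

# Toy two-player game (Nim, normal play: the player taking the last object
# wins), standing in for Lines of Action. A state is a tuple of heap sizes.

def legal_moves(heaps):
    # A move (i, k) removes k objects from heap i.
    return [(i, k) for i, n in enumerate(heaps) for k in range(1, n + 1)]

def apply_move(heaps, move):
    i, k = move
    h = list(heaps)
    h[i] -= k
    return tuple(h)

def evaluate(heaps):
    # Heuristic from the side to move's perspective (assumed for this toy
    # game): a nonzero nim-sum means the side to move can force a win.
    x = 0
    for n in heaps:
        x ^= n
    return 1.0 if x != 0 else -1.0

def alphabeta(heaps, depth, alpha, beta):
    # Plain negamax alpha-beta. In Nim's normal play, having no moves
    # means the opponent took the last object, so the side to move lost.
    moves = legal_moves(heaps)
    if not moves:
        return -1.0
    if depth == 0:
        return evaluate(heaps)
    best = -float("inf")
    for m in moves:
        v = -alphabeta(apply_move(heaps, m), depth - 1, -beta, -alpha)
        best = max(best, v)
        alpha = max(alpha, v)
        if alpha >= beta:
            break
    return best

def ab_playout_move(heaps, sample=5, rng=random):
    # Selective 2-ply search: examine only a random subset of the legal
    # moves (the sampling is this sketch's stand-in for the paper's move
    # selectivity), searching each one ply deeper before evaluating.
    moves = legal_moves(heaps)
    cand = rng.sample(moves, min(sample, len(moves)))
    best_m, best_v = None, -float("inf")
    for m in cand:
        v = -alphabeta(apply_move(heaps, m), 1, -float("inf"), float("inf"))
        if v > best_v:
            best_m, best_v = m, v
    return best_m

def playout(heaps, rng=random):
    # One MCTS play-out driven by the alpha-beta policy above.
    # Returns +1 if the player who moved first wins, -1 otherwise.
    to_move = 0
    while legal_moves(heaps):
        heaps = apply_move(heaps, ab_playout_move(heaps, rng=rng))
        to_move ^= 1
    # The player who just moved took the last object and wins.
    return 1 if to_move == 1 else -1
```

As in the paper's experiments, each play-out step here costs several static evaluations instead of one random choice, so fewer play-outs fit in a time budget; the claim is that the higher-quality simulations more than compensate.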

Original language: English
Title of host publication: 2011 IEEE Conference on Computational Intelligence and Games, CIG 2011, Seoul, South Korea, August 31 - September 3, 2011
Editors: Sung-Bae Cho, Simon M. Lucas, Philip Hingston
Publisher: IEEE
Pages: 110-117
Number of pages: 8
Publication status: Published - 2011
