Monte Carlo tree search with heuristic evaluations using implicit minimax backups

Marc Lanctot; Mark H M Winands; Tom Pepels; Nathan R. Sturtevant

doi:10.1109/CIG.2014.6932903

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic αβ search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action.
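The abstract describes the mechanism only in words, so the following is a minimal Python sketch of the idea rather than the authors' implementation. It assumes a negamax convention (every value is stored from the viewpoint of the player to move at that node), heuristic values in [-1, 1], and a linear mix of win rate and minimax value controlled by a hypothetical weight `alpha`. The names `Node`, `add_child`, `backup`, and `selection_score`, and the constants used, are illustrative assumptions and do not come from this record.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Search node; all values are from the view of the player to move here."""
    heuristic: float                                   # static evaluation, assumed in [-1, 1]
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    reward_sum: float = 0.0                            # playout results, stored separately
    minimax: float = 0.0                               # implicitly backed-up heuristic value

    def __post_init__(self) -> None:
        # A node without children starts with its own static evaluation.
        self.minimax = self.heuristic


def add_child(parent: Node, heuristic: float) -> Node:
    """Attach a child whose heuristic is from the child's player's view."""
    child = Node(heuristic=heuristic, parent=parent)
    parent.children.append(child)
    return child


def backup(leaf: Node, playout_reward: float) -> None:
    """Update both sources of information on the path from leaf to root."""
    node: Optional[Node] = leaf
    reward = playout_reward
    while node is not None:
        node.visits += 1
        node.reward_sum += reward
        if node.children:
            # Negamax step: a child's value is from the opponent's view.
            node.minimax = max(-c.minimax for c in node.children)
        node, reward = node.parent, -reward


def selection_score(child: Node, parent_visits: int,
                    alpha: float = 0.4, c: float = 0.7) -> float:
    """Assumed mix of win rate and minimax value, plus a UCT-style bonus."""
    if child.visits == 0:
        return math.inf                                # try unvisited children first
    win_rate = -child.reward_sum / child.visits        # negate: opponent's view
    mixed = (1 - alpha) * win_rate + alpha * (-child.minimax)
    return mixed + c * math.sqrt(math.log(parent_visits) / child.visits)


root = Node(heuristic=0.0)
a = add_child(root, 0.5)     # good for the opponent, so worth -0.5 to root
b = add_child(root, -0.2)    # bad for the opponent, so worth 0.2 to root
backup(a, 1.0)               # playout from a was won by the player to move at a
backup(b, -1.0)              # playout from b was lost by the player to move at b
best = max(root.children, key=lambda ch: selection_score(ch, root.visits))
```

In this sketch the playout statistics (`reward_sum`, `visits`) and the heuristic-derived `minimax` value never overwrite each other; they are combined only when a child is scored during selection, which is the separation the abstract emphasizes.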

Original language	English
Title of host publication	IEEE Conference on Computational Intelligence and Games, CIG
Publisher	IEEE Computer Society
Pages	341-348
ISBN (Print)	9781479935468
DOIs	https://doi.org/10.1109/CIG.2014.6932903
Publication status	Published - 21 Oct 2014

Cite this

@inproceedings{fc6e8c61dbab44778e96130528a3c3cd,

title = "Monte {C}arlo tree search with heuristic evaluations using implicit minimax backups",

abstract = "Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic αβ search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action.",

author = "Marc Lanctot and Winands, {Mark H M} and Tom Pepels and Sturtevant, {Nathan R.}",

year = "2014",

month = oct,

day = "21",

doi = "10.1109/CIG.2014.6932903",

language = "English",

isbn = "9781479935468",

pages = "341--348",

booktitle = "IEEE Conference on Computational Intelligence and Games, CIG",

publisher = "IEEE Computer Society",

address = "United States",

}

TY - GEN

T1 - Monte Carlo tree search with heuristic evaluations using implicit minimax backups

AU - Lanctot, Marc

AU - Winands, Mark H M

AU - Pepels, Tom

AU - Sturtevant, Nathan R.

PY - 2014/10/21

Y1 - 2014/10/21

AB - Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic αβ search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action.

U2 - 10.1109/CIG.2014.6932903

DO - 10.1109/CIG.2014.6932903

M3 - Conference article in proceeding

SN - 9781479935468

SP - 341

EP - 348

BT - IEEE Conference on Computational Intelligence and Games, CIG

PB - IEEE Computer Society

ER -