Enhancements for Monte-Carlo Tree Search in Ms Pac-Man

Tom Pepels; Mark H M Winands

doi:10.1109/CIG.2012.6374165

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man. MCTS is used to find an optimal path for an agent at each turn, determining the move to make based on randomised simulations. Ms Pac-Man is a real-time arcade game, in which the protagonist has several independent goals but no conclusive terminal state. Unlike games such as Chess or Go there is no state in which the player wins the game. Furthermore, the Pac-Man agent has to compete with a range of different ghost agents, hence limited assumptions can be made about the opponent's behaviour. In order to expand the capabilities of existing MCTS agents, five enhancements are discussed: 1) a variable depth tree, 2) playout strategies for the ghost-team and Pac-Man, 3) including long-term goals in scoring, 4) endgame tactics, and 5) a Last-Good-Reply policy for memorising rewarding moves during playouts. An average performance gain of 40,962 points, compared to the average score of the top scoring Pac-Man agent during the CIG'11, is achieved by employing these methods.

Original language	English
Title of host publication	2012 IEEE Conference on Computational Intelligence and Games, CIG 2012
Pages	265-272
Number of pages	8
DOIs	https://doi.org/10.1109/CIG.2012.6374165
Publication status	Published - 2012

Cite this

@inproceedings{a0d3d0b1fc1e41f9b65a653bc72f6863,
title = "Enhancements for Monte-Carlo Tree Search in Ms Pac-Man",
abstract = "In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man. MCTS is used to find an optimal path for an agent at each turn, determining the move to make based on randomised simulations. Ms Pac-Man is a real-time arcade game, in which the protagonist has several independent goals but no conclusive terminal state. Unlike games such as Chess or Go there is no state in which the player wins the game. Furthermore, the Pac-Man agent has to compete with a range of different ghost agents, hence limited assumptions can be made about the opponent's behaviour. In order to expand the capabilities of existing MCTS agents, five enhancements are discussed: 1) a variable depth tree, 2) playout strategies for the ghost-team and Pac-Man, 3) including long-term goals in scoring, 4) endgame tactics, and 5) a Last-Good-Reply policy for memorising rewarding moves during playouts. An average performance gain of 40,962 points, compared to the average score of the top scoring Pac-Man agent during the CIG'11, is achieved by employing these methods.",
author = "Pepels, Tom and Winands, {Mark H M}",
year = "2012",
doi = "10.1109/CIG.2012.6374165",
language = "English",
isbn = "9781467311922",
pages = "265--272",
booktitle = "2012 IEEE Conference on Computational Intelligence and Games, CIG 2012",
}

TY - GEN
T1 - Enhancements for Monte-Carlo Tree Search in Ms Pac-Man
AU - Pepels, Tom
AU - Winands, Mark H M
PY - 2012
Y1 - 2012
AB - In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man. MCTS is used to find an optimal path for an agent at each turn, determining the move to make based on randomised simulations. Ms Pac-Man is a real-time arcade game, in which the protagonist has several independent goals but no conclusive terminal state. Unlike games such as Chess or Go there is no state in which the player wins the game. Furthermore, the Pac-Man agent has to compete with a range of different ghost agents, hence limited assumptions can be made about the opponent's behaviour. In order to expand the capabilities of existing MCTS agents, five enhancements are discussed: 1) a variable depth tree, 2) playout strategies for the ghost-team and Pac-Man, 3) including long-term goals in scoring, 4) endgame tactics, and 5) a Last-Good-Reply policy for memorising rewarding moves during playouts. An average performance gain of 40,962 points, compared to the average score of the top scoring Pac-Man agent during the CIG'11, is achieved by employing these methods.
U2 - 10.1109/CIG.2012.6374165
DO - 10.1109/CIG.2012.6374165
M3 - Conference article in proceeding
SN - 9781467311922
SP - 265
EP - 272
BT - 2012 IEEE Conference on Computational Intelligence and Games, CIG 2012
ER -