Biasing MCTS with Features for General Games

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review


Abstract

This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
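The abstract describes biasing MCTS with a linear function over local pattern features learnt from self-play. As a minimal illustrative sketch (not the paper's exact formulation), per-move binary feature vectors can be scored by learnt weights, converted to a softmax policy, and used as a prior in a PUCT-style selection rule; the function names, the exploration constant `c`, and the specific PUCT form below are assumptions for illustration only.

```python
import math

def feature_policy(feature_vectors, weights):
    """Softmax over linear evaluations of per-move binary feature vectors."""
    logits = [sum(f * w for f, w in zip(vec, weights)) for vec in feature_vectors]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def puct_select(visits, value_sums, priors, c=2.0):
    """Index of the child maximising Q + c * P * sqrt(N) / (1 + n)."""
    n_parent = sum(visits)
    def score(i):
        q = value_sums[i] / visits[i] if visits[i] > 0 else 0.0
        return q + c * priors[i] * math.sqrt(n_parent) / (1 + visits[i])
    return max(range(len(visits)), key=score)
```

With two candidate moves whose feature vectors are `[1, 0]` and `[0, 1]` and weights `[2.0, 0.0]`, the first move receives the larger prior, yet an unvisited move can still be selected once the visited alternative's exploration term has decayed.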
Original language: English
Title of host publication: IEEE Congress on Evolutionary Computation
Subtitle of host publication: (CEC'19)
Publication status: Published - 11 Jun 2019

Cite this

Soemers, D., Piette, E., & Browne, C. (2019). Biasing MCTS with Features for General Games. In IEEE Congress on Evolutionary Computation (CEC'19).
