Self-adaptive MCTS for General Video Game Playing

Chiara F. Sironi; Jialin Liu; Diego Perez-Liebana; Raluca D. Gaina; Ivan Bravi; Simon M. Lucas; Mark H. M. Winands

doi:10.1007/978-3-319-77538-8_25

Corresponding author: Chiara F. Sironi

Networks and Strategic Optimization

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

Abstract

Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that almost achieves the same performance as off-line tuning in GGP. In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-Self-Adaptive MCTS agent of GVGAI. The three agents select the parameter values using Naïve Monte-Carlo, an Evolutionary Algorithm and an N-Tuple Bandit Evolutionary Algorithm respectively, and are tested on 20 single-player games of GVGAI. The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-Self-Adaptive MCTS instances that use the most sampled parameter settings during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by 4 times and 150 times, respectively.

Original language	English
Title of host publication	Applications of Evolutionary Computation. EvoApplications 2018
Editors	K. Sim, P. Kaufmann
Publisher	Springer
Pages	358-375
Number of pages	18
Volume	10784
ISBN (Electronic)	978-3-319-77538-8
ISBN (Print)	978-3-319-77537-1
DOIs	https://doi.org/10.1007/978-3-319-77538-8_25
Publication status	Published - 2018

Publication series

Series	Lecture Notes in Computer Science
Volume	10784

Cite this

@inbook{f1b1f1ddae4b4580b539d95117eaf20e,

title = "Self-adaptive MCTS for General Video Game Playing",

abstract = "Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that almost achieves the same performance as off-line tuning in GGP. In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-Self-Adaptive MCTS agent of GVGAI. The three agents select the parameter values using Na{\"i}ve Monte-Carlo, an Evolutionary Algorithm and an N-Tuple Bandit Evolutionary Algorithm respectively, and are tested on 20 single-player games of GVGAI. The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-Self-Adaptive MCTS instances that use the most sampled parameter settings during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by 4 times and 150 times, respectively.",

author = "Sironi, {Chiara F.} and Jialin Liu and Diego Perez-Liebana and Gaina, {Raluca D.} and Ivan Bravi and Lucas, {Simon M.} and Winands, {Mark H. M.}",

year = "2018",

doi = "10.1007/978-3-319-77538-8_25",

language = "English",

isbn = "978-3-319-77537-1",

volume = "10784",

series = "Lecture Notes in Computer Science",

publisher = "Springer",

pages = "358--375",

editor = "K. Sim and P. Kaufmann",

booktitle = "Applications of Evolutionary Computation. EvoApplications 2018",

address = "United States",

}

Self-adaptive MCTS for General Video Game Playing. / Sironi, Chiara F.; Liu, Jialin; Perez-Liebana, Diego et al.
Applications of Evolutionary Computation. EvoApplications 2018. ed. / K. Sim; P. Kaufmann. Vol. 10784 Springer, 2018. p. 358-375 (Lecture Notes in Computer Science, Vol. 10784).

TY - CHAP

T1 - Self-adaptive MCTS for General Video Game Playing

AU - Sironi, Chiara F.

AU - Liu, Jialin

AU - Perez-Liebana, Diego

AU - Gaina, Raluca D.

AU - Bravi, Ivan

AU - Lucas, Simon M.

AU - Winands, Mark H. M.

PY - 2018

Y1 - 2018

N2 - Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that almost achieves the same performance as off-line tuning in GGP. In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-Self-Adaptive MCTS agent of GVGAI. The three agents select the parameter values using Naïve Monte-Carlo, an Evolutionary Algorithm and an N-Tuple Bandit Evolutionary Algorithm respectively, and are tested on 20 single-player games of GVGAI. The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-Self-Adaptive MCTS instances that use the most sampled parameter settings during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by 4 times and 150 times, respectively.

AB - Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that almost achieves the same performance as off-line tuning in GGP. In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-Self-Adaptive MCTS agent of GVGAI. The three agents select the parameter values using Naïve Monte-Carlo, an Evolutionary Algorithm and an N-Tuple Bandit Evolutionary Algorithm respectively, and are tested on 20 single-player games of GVGAI. The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-Self-Adaptive MCTS instances that use the most sampled parameter settings during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by 4 times and 150 times, respectively.

U2 - 10.1007/978-3-319-77538-8_25

DO - 10.1007/978-3-319-77538-8_25

M3 - Chapter

SN - 978-3-319-77537-1

VL - 10784

T3 - Lecture Notes in Computer Science

SP - 358

EP - 375

BT - Applications of Evolutionary Computation. EvoApplications 2018

A2 - Sim, K.

A2 - Kaufmann, P.

PB - Springer

ER -