Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games

Viliam Lisy*, Marc Lanctot, Michael Bowling

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference article in proceeding, Academic, peer-review

Abstract

Online search in games has been a core interest of artificial intelligence. Search in imperfect information games (e.g., Poker, Bridge, Skat) is particularly challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling (OOS), an online search variant of Monte Carlo Counterfactual Regret Minimization that preserves its convergence to Nash equilibrium. We show that OOS can overcome the problem of non-locality encountered by previous search algorithms and perform well against its worst-case opponents. We show that the exploitability of the strategies played by OOS decreases as the amount of search time increases, and that the preexisting Information Set Monte Carlo Tree Search (ISMCTS) can become more exploitable over time. In head-to-head play, OOS outperforms ISMCTS in games where non-locality plays a significant role, given sufficient computation time per move.
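As an illustration of two notions the abstract relies on, the following is a minimal Python sketch of regret matching (the strategy update that counterfactual regret minimization methods are generally built on) and of a worst-case loss measure for a matrix game. It is not the OOS algorithm from the paper. The function names, the rock-paper-scissors payoff matrix, and the restriction to a one-shot zero-sum matrix game with value 0 are assumptions made purely for illustration.

```python
from typing import List

def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret.

    Falls back to the uniform strategy when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    n = len(cumulative_regrets)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]

def worst_case_loss(strategy: List[float], payoff: List[List[float]]) -> float:
    """Amount a best-responding column player wins against a row strategy.

    payoff[i][j] is the row player's payoff in a zero-sum matrix game.
    For a game whose value is 0 (an assumption here), this equals the
    exploitability of the row strategy.
    """
    n_cols = len(payoff[0])
    column_gains = []
    for j in range(n_cols):
        row_expected = sum(s * payoff[i][j] for i, s in enumerate(strategy))
        column_gains.append(-row_expected)  # zero-sum: column wins what row loses
    return max(column_gains)

# Hypothetical example game: rock-paper-scissors, rows and columns ordered
# rock, paper, scissors; entries are the row player's payoff.
RPS = [
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(regret_matching([3.0, 1.0, -2.0]))
    print(worst_case_loss([1.0, 0.0, 0.0], RPS))
    print(worst_case_loss([1 / 3, 1 / 3, 1 / 3], RPS))
```

In this toy setting, always playing rock can be fully exploited by an opponent who plays paper, while the uniform strategy cannot be exploited at all. This is the sense in which a strategy closer to equilibrium performs better against its worst-case opponent.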
Original language: English
Title of host publication: Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015)
Number of pages: 10
Publication status: Published - 2015
Event: 14th International Conference on Autonomous Agents and Multiagent Systems - Istanbul, Turkey
Duration: 4 May 2015 to 8 May 2015
Conference number: 14
http://www.ifaamas.org/AAMAS/aamas2015/

Conference

Conference: 14th International Conference on Autonomous Agents and Multiagent Systems
Abbreviated title: AAMAS 2015
Country/Territory: Turkey
City: Istanbul
Period: 4/05/15 to 8/05/15