Abstract
Upper Confidence bounds applied to Trees (UCT) is the default selection policy in Monte-Carlo Tree Search (MCTS), yet it makes no strategic use of ancestor node information. UCT approaches each decision level as an independent Multi-Armed Bandit problem, disregarding the results obtained along the path that led to the current state. This paper introduces an enhancement to UCT for two-player, deterministic, zero-sum games that draws on α−β pruning, a technique that makes minimax search more efficient by pruning branches that cannot affect the outcome. We propose a revised selection policy that leverages ancestor node data, mirroring the principle behind α−β pruning, to refine sample-based search. Our experiments with this enhanced method show performance gains in Breakthrough, Mini Shogi, and GoMoku, highlighting the effectiveness of incorporating ancestor search results into the MCTS selection process.
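For readers unfamiliar with the baseline, the following is a minimal Python sketch of standard UCT selection (the UCB1 rule applied at a single node). It is not the revised policy proposed in the paper; it only illustrates why plain UCT behaves like an independent bandit at each level: the score depends solely on the parent's and the child's own statistics. The `Node` class, its field names, and the exploration constant `c` are assumptions made for this illustration, and rewards are assumed to be stored from the perspective of the player choosing at the parent.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node layout for illustration only.
    visits: int = 0
    total_reward: float = 0.0  # summed from the viewpoint of the player moving at the parent
    children: List["Node"] = field(default_factory=list)


def uct_score(parent: Node, child: Node, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = math.sqrt(2)) -> Node:
    """Pick the child with the highest UCB1 score.

    Only `parent` and its direct children are consulted; nothing from
    nodes higher up the path enters the decision.
    """
    return max(parent.children, key=lambda ch: uct_score(parent, ch, c))
```

Because `select_child` never looks above `parent`, any information gathered at ancestor nodes is ignored, which is the limitation the abstract describes.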
Original language | English |
---|---|
Title of host publication | 2024 IEEE Conference on Games (CoG) |
Publisher | IEEE |
Pages | 1-4 |
ISBN (Electronic) | 979-8-3503-5067-8 |
ISBN (Print) | 979-8-3503-5068-5 |
DOIs | |
Publication status | Published - 5 Aug 2024 |
Event | 2024 IEEE Conference on Games, Milan, Italy. Duration: 5 Aug 2024 → 8 Aug 2024. https://2024.ieee-cog.org/ |
Conference
Conference | 2024 IEEE Conference on Games |
---|---|
Abbreviated title | IEEE CoG 2024 |
Country/Territory | Italy |
City | Milan |
Period | 5/08/24 → 8/08/24 |
Internet address |