Incentive Learning in Monte Carlo Tree Search

Cited by: 3
Authors
Kao, Kuo-Yuan [1]
Wu, I-Chen [2]
Yen, Shi-Jim [3]
Shan, Yi-Chang [2]
Affiliations
[1] Natl Penghu Univ, Dept Informat Management, Magong City 880, Taiwan
[2] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu 30050, Taiwan
[3] Natl Dong Hwa Univ, Dept Comp Sci & Informat Engn, Hualien 974, Taiwan
Funding
U.S. National Science Foundation
Keywords
Artificial intelligence; combinatorial games; computational intelligence; computer games; reinforcement learning
DOI
10.1109/TCIAIG.2013.2248086
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monte Carlo tree search (MCTS) is a search paradigm that has been remarkably successful in computer games like Go. It uses Monte Carlo simulation to evaluate the values of nodes in a search tree. The node values are then used to select the actions during subsequent simulations. The performance of MCTS heavily depends on the quality of its default policy, which guides the simulations beyond the search tree. In this paper, we propose an MCTS improvement, called incentive learning, which learns the default policy online. This new default policy learning scheme is based on ideas from combinatorial game theory, and hence is particularly useful when the underlying game is a sum of games. To illustrate the efficiency of incentive learning, we describe a game named Heap-Go and present experimental results on the game.
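The roles of the search tree and the default policy described in the abstract can be sketched with a plain UCT search on a toy single-heap take-away game. This is a minimal illustration under stated assumptions, not the paper's incentive learning method and not Heap-Go: the game (remove 1 to 3 stones, taking the last stone wins), the names `Node`, `default_policy`, `uct_child`, and `mcts`, and the exploration constant are all hypothetical choices made for the example. The default policy here is fixed and uniformly random; the paper's contribution is to learn that policy online instead.

```python
import math
import random


class Node:
    """Search-tree node for a toy take-away game (hypothetical example)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones          # stones left for the player to move
        self.parent = parent
        self.move = move              # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0               # wins for the player who moved INTO this node


def default_policy(stones, rng):
    """Uniform random playout beyond the tree.

    Returns True if the player to move at `stones` ends up taking the last stone.
    """
    to_move_wins = False              # with 0 stones, the player to move has lost
    turn = 0                          # 0 = the player to move at the start
    while stones > 0:
        stones -= rng.choice([m for m in (1, 2, 3) if m <= stones])
        if stones == 0:
            to_move_wins = (turn == 0)
        turn ^= 1
    return to_move_wins


def uct_child(node, c=1.4):
    """Pick the child maximizing mean value plus a UCB1 exploration bonus."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(root_stones, iterations=2000, seed=0):
    """Run UCT from `root_stones` and return the most-visited root move."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # Selection: descend using node values while fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # Expansion: add one new child if any move is untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # Simulation: the default policy plays out the rest of the game.
        mover_wins = not default_policy(node.stones, rng)
        # Backpropagation: flip the result at each level (players alternate).
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

In this toy game, leaving the opponent a multiple of four stones is the winning strategy, so a call such as `mcts(5)` should come to favor removing one stone as the node statistics accumulate. Replacing `default_policy` with a learned playout policy is where a scheme like the one the abstract describes would plug in.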
Pages: 346-352
Number of pages: 7