Incentive Learning in Monte Carlo Tree Search

Cited by: 3
Authors
Kao, Kuo-Yuan [1]
Wu, I-Chen [2]
Yen, Shi-Jim [3]
Shan, Yi-Chang [2]
Affiliations
[1] Natl Penghu Univ, Dept Informat Management, Magong City 880, Taiwan
[2] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu 30050, Taiwan
[3] Natl Dong Hwa Univ, Dept Comp Sci & Informat Engn, Hualien 974, Taiwan
Funding
National Science Foundation (USA)
Keywords
Artificial intelligence; combinatorial games; computational intelligence; computer games; reinforcement learning
DOI
10.1109/TCIAIG.2013.2248086
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monte Carlo tree search (MCTS) is a search paradigm that has been remarkably successful in computer games like Go. It uses Monte Carlo simulation to evaluate the values of nodes in a search tree. The node values are then used to select the actions during subsequent simulations. The performance of MCTS heavily depends on the quality of its default policy, which guides the simulations beyond the search tree. In this paper, we propose an MCTS improvement, called incentive learning, which learns the default policy online. This new default policy learning scheme is based on ideas from combinatorial game theory, and hence is particularly useful when the underlying game is a sum of games. To illustrate the efficiency of incentive learning, we describe a game named Heap-Go and present experimental results on the game.
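The roles of the search tree and the default policy described in the abstract can be illustrated with a minimal, generic UCT-style sketch. This is not the paper's incentive learning scheme and the game is not Heap-Go: as an assumption for illustration it uses Nim, a simple sum of independent heaps, with a fixed uniformly random default policy. All names (`legal_moves`, `random_policy`, `mcts`, and so on) are hypothetical.

```python
import math
import random

# Toy game (assumed for illustration): Nim. A state is a tuple of heap sizes,
# a move (i, k) removes k stones from heap i, and whoever takes the last stone wins.

def legal_moves(state):
    return [(i, k) for i, h in enumerate(state) for k in range(1, h + 1)]

def apply_move(state, move):
    i, k = move
    return state[:i] + (state[i] - k,) + state[i + 1:]

def random_policy(state, rng):
    """Default policy: choose a legal move uniformly at random."""
    return rng.choice(legal_moves(state))

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state = state
        self.parent = parent
        self.move = move                    # move that led to this node
        self.children = []
        self.untried = legal_moves(state)   # moves not yet expanded
        self.visits = 0
        self.wins = 0.0                     # wins for the player who moved INTO this node

def uct_child(node, c=1.4):
    """Tree policy: pick the child maximising the UCB1 score."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def rollout(state, default_policy, rng):
    """Simulate to the end; return 1 if the player to move at `state` wins, else 0."""
    player = 0  # 0 = player to move at the start of the rollout
    while legal_moves(state):
        state = apply_move(state, default_policy(state, rng))
        player ^= 1
    # `player` now has no move, so the other player took the last stone and won.
    return 1 if (player ^ 1) == 0 else 0

def mcts(root_state, iterations=2000, default_policy=random_policy, seed=0):
    """Return the most-visited move from a non-terminal root state."""
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # Expansion: add one new child if possible.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(apply_move(node.state, move), parent=node, move=move)
            node.children.append(child)
            node = child
        # Simulation: the default policy plays beyond the tree.
        result = rollout(node.state, default_policy, rng)
        # Backpropagation: flip the reward at each level (two-player, zero-sum).
        reward = 1 - result
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1 - reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

The `default_policy` parameter is the hook where a learned policy would replace `random_policy`; how the paper learns that policy online is not reproduced here.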
Pages: 346-352
Page count: 7