Model-based reinforcement learning for alternating Markov games

Cited by: 0

Authors:

Mellor, D [1]

Affiliations:

[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia

Source:

AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003 / Vol. 2903

Keywords:

game playing; machine learning; planning; reinforcement learning; search

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.
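The abstract's core idea, minimax value backups combined with planning over a learned model of a deterministic alternating game, can be illustrated with a small tabular sketch for Tic-Tac-Toe. This is not the paper's Minimax-MBTD algorithm, whose details are not given in this record. The function names (`to_move`, `step`, `backup`, `train`), the reward convention (+1 for an X win, -1 for an O win), uniformly random move selection, and full backups without a learning rate are all assumptions made for illustration.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def to_move(board):
    # X moves first, so X is to move whenever the mark counts are equal.
    return "X" if board.count("X") == board.count("O") else "O"

def step(board, action):
    # Deterministic environment step; the reward is from X's point of view.
    player = to_move(board)
    nxt = board[:action] + player + board[action + 1:]
    for a, b, c in LINES:
        if nxt[a] == nxt[b] == nxt[c] == player:
            return nxt, (1.0 if player == "X" else -1.0), True
    return nxt, 0.0, " " not in nxt

def backup(V, model, state):
    # Minimax backup over the transitions recorded so far for this state:
    # X takes the maximum, O the minimum (assumed convention).
    outcomes = [r if done else r + V.get(nxt, 0.0)
                for nxt, r, done in model[state].values()]
    V[state] = max(outcomes) if to_move(state) == "X" else min(outcomes)

def train(episodes=200, planning_steps=10, seed=0):
    rng = random.Random(seed)
    V, model = {}, {}
    for _ in range(episodes):
        state, done = " " * 9, False
        while not done:
            # Uniformly random exploration (an assumption of this sketch).
            action = rng.choice([i for i, c in enumerate(state) if c == " "])
            nxt, r, done = step(state, action)
            # Record the observed deterministic transition in the model.
            model.setdefault(state, {})[action] = (nxt, r, done)
            backup(V, model, state)  # update from real experience
            # Unfocussed planning: back up randomly chosen modelled states.
            for s in rng.choices(list(model), k=planning_steps):
                backup(V, model, s)
            state = nxt
    return V, model
```

Calling `train()` returns the value table and the learned model. Replacing the random planning loop with backups restricted to successors of the current state would give a focussed variant of the kind the abstract compares against.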

Pages: 520-531

Page count: 12

50 records in total

[41] Generating Fingerings for Piano Music with Model-Based Reinforcement Learning
Gao, Wanxiang
Zhang, Sheng
Zhang, Nanxi
Xiong, Xiaowu
Shi, Zhaojun
Sun, Ka
APPLIED SCIENCES-BASEL, 2023, 13 (20):
[42] Model-Based Reinforcement Learning via Proximal Policy Optimization
Sun, Yuewen
Yuan, Xin
Liu, Wenzhang
Sun, Changyin
2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 4736 - 4740
[43] TEAMSTER: Model-based reinforcement learning for ad hoc teamwork
Ribeiro, Joao G.
Rodrigues, Goncalo
Sardinha, Alberto
Melo, Francisco S.
ARTIFICIAL INTELLIGENCE, 2023, 324
[44] Advances in model-based reinforcement learning for Adaptive Optics control
Nousiainen, Jalo
Engler, Byron
Kasper, Markus
Helin, Tapio
Heritier, Cedric T.
Rajani, Chang
ADAPTIVE OPTICS SYSTEMS VIII, 2022, 12185
[45] Inventory management of new products in retailers using model-based deep reinforcement learning
Demizu, Tsukasa
Fukazawa, Yusuke
Morita, Hiroshi
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
[46] Importance-Weighted Variational Inference Model Estimation for Offline Bayesian Model-Based Reinforcement Learning
Hishinuma, Toru
Senda, Kei
IEEE ACCESS, 2023, 11 : 145579 - 145590
[47] EEG-based classification of learning strategies : model-based and model-free reinforcement learning
Kim, Dongjae
Weston, Charles
Lee, Sang Wan
2018 6TH INTERNATIONAL CONFERENCE ON BRAIN-COMPUTER INTERFACE (BCI), 2018, : 146 - 148
[48] PAC Reinforcement Learning Algorithm for General-Sum Markov Games
Zehfroosh, Ashkan
Tanner, Herbert G.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (05) : 2821 - 2831
[49] Solving Cyber Alert Allocation Markov Games with Deep Reinforcement Learning
Dunstatter, Noah
Tahsini, Alireza
Guirguis, Mina
Tesic, Jelena
DECISION AND GAME THEORY FOR SECURITY, 2019, 11836 : 164 - 183
[50] Model-Based and Model-Free Replay Mechanisms for Reinforcement Learning in Neurorobotics
Massi, Elisa
Barthelemy, Jeanne
Mailly, Juliane
Dromnelle, Remi
Canitrot, Julien
Poniatowski, Esther
Girard, Benoit
Khamassi, Mehdi
FRONTIERS IN NEUROROBOTICS, 2022, 16
