Model-based reinforcement learning for alternating Markov games

Cited by: 0
Authors
Mellor, D [1 ]
Affiliation
[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia
Source
AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003, Vol. 2903
Keywords
game playing; machine learning; planning; reinforcement learning; search;
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.
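The paper itself is not reproduced here, but the abstract's core idea, temporal-difference learning whose backup target is a minimax (negamax) operator over successor states, combined with extra planning backups on previously visited states, can be illustrated with a small sketch. The game below is Nim (take 1 or 2 stones, taker of the last stone wins) rather than Tic-Tac-Toe, to keep the state space tiny; the function names, hyperparameters, and the use of a known `successors` function in place of a learned transition model are all illustrative assumptions, not the paper's actual implementation.

```python
import random

# Toy deterministic alternating game: Nim with N stones, players alternately
# remove 1 or 2 stones; whoever takes the last stone wins.
# V[s] is the value of state s from the perspective of the player to move
# (negamax convention), so V[0] = -1: the player to move has already lost.
N = 7

def successors(s):
    """Successor states reachable in one move (stands in for a learned model)."""
    return [s - k for k in (1, 2) if s - k >= 0]

def minimax_backup(V, s):
    """One-step minimax (negamax) target: pick the successor worst for the opponent."""
    return max(-V[s2] for s2 in successors(s))

def minimax_mbtd(episodes=500, alpha=0.5, planning_steps=10, eps=0.2, seed=0):
    rng = random.Random(seed)
    V = [0.0] * (N + 1)
    V[0] = -1.0                 # terminal loss for the player to move
    visited = set()             # states recorded for planning backups
    for _ in range(episodes):
        s = N
        while s > 0:
            visited.add(s)
            # Direct TD-style update toward the minimax backup target.
            V[s] += alpha * (minimax_backup(V, s) - V[s])
            # Planning: replay backups on states sampled from experience.
            for _ in range(planning_steps):
                p = rng.choice(tuple(visited))
                V[p] += alpha * (minimax_backup(V, p) - V[p])
            # Epsilon-greedy self-play with respect to the learned values.
            if rng.random() > eps:
                s = max(successors(s), key=lambda s2: -V[s2])
            else:
                s = rng.choice(successors(s))
    return V

V = minimax_mbtd()
# Known result for this Nim variant: multiples of 3 are losses for the
# player to move, every other position is a win.
print([round(v) for v in V])
```

Under these assumptions the learned values converge to the game-theoretic ones (-1 for positions divisible by 3, +1 otherwise). The abstract's finding about planning focus could be probed in this sketch by replacing the uniform `rng.choice(tuple(visited))` with sampling restricted to successors of the current state.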
Pages: 520-531
Number of pages: 12
Related Papers
50 in total
  • [21] Model-based Bayesian Reinforcement Learning for Dialogue Management
    Lison, Pierre
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 475 - 479
  • [22] Learnable Weighting Mechanism in Model-based Reinforcement Learning
    Huang W.-Z.
    Yin Q.-Y.
    Zhang J.-G.
    Huang K.-Q.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (06): : 2765 - 2775
  • [23] Exploration in Relational Domains for Model-based Reinforcement Learning
    Lang, Tobias
    Toussaint, Marc
    Kersting, Kristian
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 3725 - 3768
  • [24] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [25] Free Will Belief as a Consequence of Model-Based Reinforcement Learning
    Rehn, Erik M.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 353 - 363
  • [26] Offline Model-Based Reinforcement Learning for Tokamak Control
    Char, Ian
    Abbate, Joseph
    Bardoczi, Laszlo
    Boyer, Mark D.
    Chung, Youngseog
    Conlin, Rory
    Erickson, Keith
    Mehta, Viraj
    Richner, Nathan
    Kolemen, Egemen
    Schneider, Jeff
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [27] Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding
    Tan, Xiaoyu
    Qu, Chao
    Xiong, Junwu
    Zhang, James
    Qiu, Xihe
    Jin, Yaochu
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04): : 2974 - 2986
  • [28] Model-Based Reinforcement Learning for Humanoids: A Study on Forming Rewards with the iCub platform
    Fachantidis, Anestis
    Di Nuovo, Alessandro
    Cangelosi, Angelo
    Vlahavas, Ioannis
    2013 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE, COGNITIVE ALGORITHMS, MIND, AND BRAIN (CCMB), 2013, : 87 - 93
  • [29] Uncertainty-Aware Model-Based Offline Reinforcement Learning for Automated Driving
    Diehl, Christopher
    Sievernich, Timo Sebastian
    Kruger, Martin
    Hoffmann, Frank
    Bertram, Torsten
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (02) : 1167 - 1174
  • [30] Case-Based Task Generalization in Model-Based Reinforcement Learning
    Zholus, Artem
    Panov, Aleksandr I.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2021, 2022, 13154 : 344 - 354