Model-based reinforcement learning for alternating Markov games

Cited by: 0
Authors
Mellor, D [1 ]
Affiliation
[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia
Source
AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003, Vol. 2903
Keywords
game playing; machine learning; planning; reinforcement learning; search;
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.
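The paper itself is not reproduced here, but the abstract's core idea, temporal-difference learning whose backup target is a minimax (negamax) operator over successor states, combined with extra planning backups on previously visited states, can be illustrated with a small sketch. The game below is Nim (take 1 or 2 stones, taker of the last stone wins) rather than Tic-Tac-Toe, to keep the state space tiny; the function names, hyperparameters, and the use of a known `successors` function in place of a learned transition model are all illustrative assumptions, not the paper's actual implementation.

```python
import random

# Toy deterministic alternating game: Nim with N stones, players alternately
# remove 1 or 2 stones; whoever takes the last stone wins.
# V[s] is the value of state s from the perspective of the player to move
# (negamax convention), so V[0] = -1: the player to move has already lost.
N = 7

def successors(s):
    """Successor states reachable in one move (stands in for a learned model)."""
    return [s - k for k in (1, 2) if s - k >= 0]

def minimax_backup(V, s):
    """One-step minimax (negamax) target: pick the successor worst for the opponent."""
    return max(-V[s2] for s2 in successors(s))

def minimax_mbtd(episodes=500, alpha=0.5, planning_steps=10, eps=0.2, seed=0):
    rng = random.Random(seed)
    V = [0.0] * (N + 1)
    V[0] = -1.0                 # terminal loss for the player to move
    visited = set()             # states recorded for planning backups
    for _ in range(episodes):
        s = N
        while s > 0:
            visited.add(s)
            # Direct TD-style update toward the minimax backup target.
            V[s] += alpha * (minimax_backup(V, s) - V[s])
            # Planning: replay backups on states sampled from experience.
            for _ in range(planning_steps):
                p = rng.choice(tuple(visited))
                V[p] += alpha * (minimax_backup(V, p) - V[p])
            # Epsilon-greedy self-play with respect to the learned values.
            if rng.random() > eps:
                s = max(successors(s), key=lambda s2: -V[s2])
            else:
                s = rng.choice(successors(s))
    return V

V = minimax_mbtd()
# Known result for this Nim variant: multiples of 3 are losses for the
# player to move, every other position is a win.
print([round(v) for v in V])
```

Under these assumptions the learned values converge to the game-theoretic ones (-1 for positions divisible by 3, +1 otherwise). The abstract's finding about planning focus could be probed in this sketch by replacing the uniform `rng.choice(tuple(visited))` with sampling restricted to successors of the current state.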
Pages: 520-531
Number of pages: 12
Related Papers
50 in total
  • [21] Model-based Bayesian Reinforcement Learning for Dialogue Management
    Lison, Pierre
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 475 - 479
  • [22] Learnable Weighting Mechanism in Model-based Reinforcement Learning
    Huang W.-Z.
    Yin Q.-Y.
    Zhang J.-G.
    Huang K.-Q.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (06): : 2765 - 2775
  • [23] Exploration in Relational Domains for Model-based Reinforcement Learning
    Lang, Tobias
    Toussaint, Marc
    Kersting, Kristian
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 3725 - 3768
  • [24] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [25] Free Will Belief as a Consequence of Model-Based Reinforcement Learning
    Rehn, Erik M.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 353 - 363
  • [26] Offline Model-Based Reinforcement Learning for Tokamak Control
    Char, Ian
    Abbate, Joseph
    Bardoczi, Laszlo
    Boyer, Mark D.
    Chung, Youngseog
    Conlin, Rory
    Erickson, Keith
    Mehta, Viraj
    Richner, Nathan
    Kolemen, Egemen
    Schneider, Jeff
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [27] Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding
    Tan, Xiaoyu
    Qu, Chao
    Xiong, Junwu
    Zhang, James
    Qiu, Xihe
    Jin, Yaochu
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04): : 2974 - 2986
  • [28] Model-Based Reinforcement Learning for Humanoids: A Study on Forming Rewards with the iCub platform
    Fachantidis, Anestis
    Di Nuovo, Alessandro
    Cangelosi, Angelo
    Vlahavas, Ioannis
    2013 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE, COGNITIVE ALGORITHMS, MIND, AND BRAIN (CCMB), 2013, : 87 - 93
  • [29] Uncertainty-Aware Model-Based Offline Reinforcement Learning for Automated Driving
    Diehl, Christopher
    Sievernich, Timo Sebastian
    Kruger, Martin
    Hoffmann, Frank
    Bertram, Torsten
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (02) : 1167 - 1174
  • [30] Case-Based Task Generalization in Model-Based Reinforcement Learning
    Zholus, Artem
    Panov, Aleksandr I.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2021, 2022, 13154 : 344 - 354