Model-based reinforcement learning for alternating Markov games

Times cited: 0
Authors
Mellor, D [1]
Affiliations
[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia
Source
AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003 / Vol. 2903
Keywords
game playing; machine learning; planning; reinforcement learning; search
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.
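The idea of backing up minimax values through a deterministic model can be illustrated with a small sketch. This is not the Minimax-MBTD algorithm from the paper: it assumes the Tic-Tac-Toe rules are known in advance instead of learnt, it omits the temporal-difference update entirely, and every name in it (`minimax_backup`, `random_planning`, the 9-character board strings) is hypothetical. It only shows a tabular minimax backup applied to randomly chosen states, in the spirit of the unfocussed random planning mentioned in the abstract.

```python
import random

# The eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def to_move(board):
    """X moves first, so X is to move whenever the mark counts are equal."""
    return 'X' if board.count('X') == board.count('O') else 'O'


def successors(board):
    """Deterministic model (assumed known): all boards one move away."""
    mark = to_move(board)
    return [board[:i] + mark + board[i + 1:]
            for i, cell in enumerate(board) if cell == ' ']


def terminal_value(board):
    """+1 if X won, -1 if O won, 0 for a draw, None if the game continues."""
    w = winner(board)
    if w == 'X':
        return 1.0
    if w == 'O':
        return -1.0
    return 0.0 if ' ' not in board else None


def value_of(V, board):
    """Terminal boards use their outcome; others use the table (default 0)."""
    tv = terminal_value(board)
    return tv if tv is not None else V.get(board, 0.0)


def minimax_backup(V, board):
    """Set V[board] to the max (X to move) or min (O to move) over successors."""
    tv = terminal_value(board)
    if tv is not None:
        V[board] = tv
        return
    values = [value_of(V, s) for s in successors(board)]
    V[board] = max(values) if to_move(board) == 'X' else min(values)


def random_planning(V, states, n_backups, rng):
    """Unfocussed planning: back up states drawn uniformly at random."""
    for _ in range(n_backups):
        minimax_backup(V, rng.choice(states))
```

For example, calling `minimax_backup(V, "XX OO    ")` with an empty table `V` stores the value 1.0 for that board, because X (the maximiser here) can complete the top row on the next move.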
Pages: 520-531
Page count: 12