Model-based reinforcement learning for alternating Markov games

Author
Mellor, D [1]
Affiliation
[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia
Source
AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003 / Vol. 2903
Keywords
game playing; machine learning; planning; reinforcement learning; search
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.
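The idea of planning with a minimax measure of optimality over a deterministic model can be illustrated with a minimal sketch. This is not the Minimax-MBTD algorithm itself, whose details the abstract does not give. It is plain minimax value iteration on Tic-Tac-Toe, assuming the game rules are already available as a perfect model. All names here (`successors`, `minimax_backup`, `plan`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

State = Tuple[str, ...]  # 9 cells, each "X", "O" or " "

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
EMPTY: State = (" ",) * 9


def winner(s: State) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if s[a] != " " and s[a] == s[b] == s[c]:
            return s[a]
    return None


def to_move(s: State) -> str:
    # X moves first, so X is to move whenever the piece counts are equal.
    return "X" if s.count("X") == s.count("O") else "O"


def successors(s: State) -> List[State]:
    """Deterministic model (assumed known here): all states one move away."""
    if winner(s) is not None:
        return []
    p = to_move(s)
    return [s[:i] + (p,) + s[i + 1:] for i in range(9) if s[i] == " "]


def terminal_value(s: State) -> float:
    """Zero-sum payoff from X's point of view: win 1, loss -1, draw 0."""
    w = winner(s)
    return 1.0 if w == "X" else -1.0 if w == "O" else 0.0


def reachable(start: State) -> List[State]:
    """Enumerate every state reachable from `start` under the model."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for n in successors(s):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return list(seen)


def minimax_backup(values: Dict[State, float], s: State) -> float:
    """One minimax backup: X maximises and O minimises over successor values."""
    nxt = successors(s)
    if not nxt:
        return terminal_value(s)
    vals = [values[n] for n in nxt]
    return max(vals) if to_move(s) == "X" else min(vals)


def plan(start: State, max_sweeps: int = 20) -> Dict[State, float]:
    """Sweep all states, applying minimax backups until values stop changing."""
    states = reachable(start)
    values = {s: 0.0 for s in states}
    for _ in range(max_sweeps):
        changed = False
        for s in states:
            v = minimax_backup(values, s)
            if v != values[s]:
                values[s] = v
                changed = True
        if not changed:
            break
    return values


values = plan(EMPTY)
```

Because the backup takes the worst case over the opponent's replies, the resulting values do not depend on how any particular opponent plays, which is the sense in which a minimax strategy generalises to arbitrary opponents. A learning agent would instead build the model from experience and choose which states to back up; this sketch sweeps every state for simplicity.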
Pages: 520 - 531
Page count: 12
Related papers
50 in total
  • [31] Sequential Monte Carlo Samplers for Model-Based Reinforcement Learning
    Sonmez, Orhan
    Cemgil, A. Taylan
    2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013
  • [32] Model-Based Reinforcement Learning via Stochastic Hybrid Models
    Abdulsamad, Hany
    Peters, Jan
    IEEE OPEN JOURNAL OF CONTROL SYSTEMS, 2023, 2 : 155 - 170
  • [33] Model-Based Offline Reinforcement Learning for Autonomous Delivery of Guidewire
    Li, Hao
    Zhou, Xiao-Hu
    Xie, Xiao-Liang
    Liu, Shi-Qi
    Feng, Zhen-Qiu
    Gui, Mei-Jiang
    Xiang, Tian-Yu
    Huang, De-Xing
    Hou, Zeng-Guang
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2024, 6 (03) : 1054 - 1062
  • [34] Federated Ensemble Model-Based Reinforcement Learning in Edge Computing
    Wang, Jin
    Hu, Jia
    Mills, Jed
    Min, Geyong
    Xia, Ming
    Georgalas, Nektarios
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (06) : 1848 - 1859
  • [35] Model-based hierarchical reinforcement learning and human action control
    Botvinick, Matthew
    Weinstein, Ari
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2014, 369 (1655)
  • [36] Model-based Reinforcement Learning for Ship Path Following with Disturbances
    Dong, Zhengyang
    Chen, Linying
    Huang, Yamin
    Chen, Pengfei
    Mou, Junmin
    IFAC PAPERSONLINE, 2024, 58 (20) : 247 - 252
  • [37] Intrinsic Motivation in Model-Based Reinforcement Learning: A Brief Review
    Latyshev, A. K.
    Panov, A. I.
    Scientific and Technical Information Processing, 2024, 51 (5) : 460 - 470
  • [38] Model-based deep reinforcement learning for wind energy bidding
    Sanayha, Manassakan
    Vateekul, Peerapon
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2022, 136
  • [39] Model-Based Reinforcement Learning with Automated Planning for Network Management
    Ordonez, Armando
    Mauricio Caicedo, Oscar
    Villota, William
    Rodriguez-Vivas, Angela
    da Fonseca, Nelson L. S.
    SENSORS, 2022, 22 (16)
  • [40] Reward-respecting subtasks for model-based reinforcement learning
    Sutton, Richard S.
    Machado, Marlos C.
    Holland, Zacharias
    Szepesvari, David
    Timbers, Finbarr
    Tanner, Brian
    White, Adam
    ARTIFICIAL INTELLIGENCE, 2023, 324