Model-based reinforcement learning for alternating Markov games

Cited by: 0

Authors:

Mellor, D [1]

Affiliations:

[1] Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia

Source:

AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2003 / Volume 2903

Keywords:

game playing; machine learning; planning; reinforcement learning; search

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Online training is a promising technique for training reinforcement learning agents to play strategy board games over the internet against human opponents. But the limited training experience that can be generated by playing against real humans online means that learning must be data-efficient. Data-efficiency has been achieved in other domains by augmenting reinforcement learning with a model: model-based reinforcement learning. In this paper the Minimax-MBTD algorithm is presented, which extends model-based reinforcement learning to deterministic alternating Markov games, a generalisation of two-player zero-sum strategy board games like chess and Go. By using a minimax measure of optimality the strategy learnt generalises to arbitrary opponents, unlike approaches that explicitly model specific opponents. Minimax-MBTD is applied to Tic-Tac-Toe and found to converge faster than direct reinforcement learning, but focussing planning on successors to the current state resulted in slower convergence than unfocussed random planning.


Pages: 520-531

Page count: 12

