Reinforcement Learning in RoboCup KeepAway with Partial Observability

Cited by: 0
Authors
Devlin, Sam [1 ]
Grzes, Marek [1 ]
Kudenko, Daniel [1 ]
Affiliations
[1] Univ York, Dept Comp Sci, York YO10 5DD, N Yorkshire, England
Source
2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 2 | 2009
Keywords
Belief state; KeepAway; partial observability; POMDP; reinforcement learning;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Partially observable environments pose a major challenge to the application of reinforcement learning algorithms. In such environments, the Markov property is frequently violated in the system's state representation, so situations arise in which an agent has insufficient information to choose the optimal action. In these cases it is necessary to determine when information-gathering actions should be executed, that is, when the agent needs to reduce uncertainty about the current state before deciding how to act. One solution proposed in past research is to hand-code rules for executing information-gathering actions into the policy using heuristic (and likely faulty) knowledge. However, such a solution requires explicit expert knowledge about which actions gather information. In this paper a flexible solution is proposed that automatically learns when to execute information-gathering actions and, furthermore, automatically discovers which actions gather information. We present an evaluation in the RoboCup KeepAway domain that empirically shows the robustness of the proposed approach and its success in learning under varying degrees of partial observability. The approach thus eliminates the need for hand-coded rules, is flexible across situations, and requires no prior knowledge of which actions gather information.
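The paper's own algorithm is not reproduced in this record. Purely as an illustration of the belief-state machinery the keywords and abstract refer to (belief state, POMDP), a minimal discrete Bayes filter might look like the sketch below; the function name, array shapes, and the tiny two-state model are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """Bayes filter step for a discrete POMDP (illustrative sketch).

    belief : (S,) prior probability over states
    T      : (A, S, S) transition model, T[a, s, s'] = P(s' | s, a)
    O      : (A, S, Z) observation model, O[a, s', z] = P(z | s', a)
    Returns the posterior belief after taking `action` and seeing `observation`.
    """
    # Predict: propagate the prior through the transition model.
    predicted = belief @ T[action]                      # shape (S,)
    # Correct: weight by the observation likelihood, then normalise.
    posterior = predicted * O[action, :, observation]
    return posterior / posterior.sum()

# Tiny two-state example: static dynamics, an 80%-accurate sensor.
# A single reading of "observation 0" sharpens a uniform belief.
T = np.array([[[1.0, 0.0], [0.0, 1.0]]])   # one action, identity dynamics
O = np.array([[[0.8, 0.2], [0.2, 0.8]]])   # noisy sensor
b = belief_update(np.array([0.5, 0.5]), 0, 0, T, O)
print(b)  # belief shifts toward state 0: [0.8, 0.2]
```

An information-gathering action is one whose observation model concentrates the posterior like this; a policy learned over such beliefs, rather than over raw (non-Markov) observations, is the usual way to act under partial observability.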
Pages: 201-208
Page count: 8
Related papers (50 in total)
  • [1] Reinforcement learning for RoboCup soccer keepaway
    Stone, P
    Sutton, RS
    Kuhlmann, G
    ADAPTIVE BEHAVIOR, 2005, 13 (03) : 165 - 188
  • [2] Concurrent Hierarchical Reinforcement Learning for RoboCup Keepaway
    Bai, Aijun
    Russell, Stuart
    Chen, Xiaoping
    ROBOCUP 2017: ROBOT WORLD CUP XXI, 2018, 11175 : 190 - 203
  • [3] Argumentation-Based Reinforcement Learning for RoboCup Keepaway
    Gao, Yang
    Toni, Francesca
    Craven, Robert
    COMPUTATIONAL MODELS OF ARGUMENT, 2012, 245 : 519 - +
  • [4] Argumentation-Based Reinforcement Learning for RoboCup Soccer Keepaway
    Gao, Yang
    Toni, Francesca
    Craven, Robert
    20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012), 2012, 242 : 342 - 347
  • [5] Equivariant Reinforcement Learning under Partial Observability
    Nguyen, Hai
    Baisero, Andrea
    Klee, David
    Wang, Dian
    Platt, Robert
    Amato, Christopher
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [6] Learning of Keepaway Task for RoboCup Soccer Agent Based on Fuzzy Q-Learning
    Sawa, Toru
    Watanabe, Toshihiko
    2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011, : 250 - 256
  • [7] Explicitly Learning Policy Under Partial Observability in Multiagent Reinforcement Learning
    Yang, Chen
    Yang, Guangkai
    Chen, Hao
    Zhang, Junge
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [8] On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability
    Francois-Lavet, Vincent
    Rabusseau, Guillaume
    Pineau, Joelle
    Ernst, Damien
    Fonteneau, Raphael
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5055 - 5059
  • [9] Agent Modelling under Partial Observability for Deep Reinforcement Learning
    Papoudakis, Georgios
    Christianos, Filippos
    Albrecht, Stefano V.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] FoLaR: Foggy Latent Representations for Reinforcement Learning with Partial Observability
    Meisheri, Hardik
    Khadilkar, Harshad
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,