Learning Reward Machines in Cooperative Multi-agent Tasks

Cited by: 2
Authors
Ardon, Leo [1]
Furelos-Blanco, Daniel [1]
Russo, Alessandra [1]
Affiliations
[1] Imperial Coll London, London, England
Source
AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS. BEST AND VISIONARY PAPERS, AAMAS 2023 WORKSHOPS | 2024 / Vol. 14456
Keywords
Multi-Agent Reinforcement Learning; Reward Machines; Neuro-Symbolic; Symbolic Machine Learning
DOI
10.1007/978-3-031-56255-6_3
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of Reward Machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments and improves the interpretability of the learnt policies required to complete a cooperative task. The RMs associated with the sub-tasks are learnt in a decentralised manner and then used to guide the behaviour of each agent in a team acting towards a common goal. By doing so, the complexity of a cooperative multi-agent problem is reduced, allowing for more effective learning. The results suggest that our approach is a promising direction for future research in cooperative MARL, especially in complex and partially observable environments.
Pages: 43-59
Number of pages: 17