Reward Specifications in Collaborative Multi-agent Learning: A Comparative Study

Cited by: 0

Authors:

Hasan, Maram [1]

Niyogi, Rajdeep [1]

Affiliations:

[1] IIT, Roorkee, Uttarakhand, India

Source:

39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024 | 2024

Keywords:

Collaborative Learning; Multi-Agent Reinforcement Learning; Reward Specifications; Intrinsic Rewards; Reward Shaping

DOI:

10.1145/3605098.3636028

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Subject classification codes:

081203; 0835

Abstract:

Reinforcement learning is a prominent learning paradigm that seeks to maximize cumulative rewards over time. Nevertheless, some real-life problems exhibit inherently sparse rewards, which makes it difficult for standard reinforcement learning algorithms to learn optimal policies efficiently without frequent feedback. In multi-agent environments, reward specifications play a crucial role in collaborative learning: they define reward structures that guide agents toward desired behaviors and help address the challenge of sparse rewards. This paper presents a new study that explores the impact of reward specification techniques on collaborative learning in multi-agent environments. In our experiments, we combine state-of-the-art multi-agent reinforcement learning (MARL) algorithms, which have been shown to be effective in dense-reward environments, with different reward specifications, and evaluate their performance under sparse-reward settings in a variety of environments, including discrete and complex scenarios. In addition, we provide in-depth insights into how factors such as task nature and information availability influence the impact of reward specifications on agent learning and coordination. To assess these aspects, we examine the average team rewards and convergence speed. The results highlight the importance of reward specifications and can aid researchers and practitioners in selecting effective techniques for various real-world collaborative problems.


Pages: 1007-1013

Page count: 7

