Automated design of action advising trigger conditions for multiagent reinforcement A

Cited by: 16

Authors:

Wang, Tonghao [1,2]

Peng, Xingguang [2]

Wang, Tao [2]

Liu, Tong [2]

Xu, Demin [2]

Affiliations:

[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China

[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian 710072, Peoples R China

Source:

SWARM AND EVOLUTIONARY COMPUTATION | 2024 / Vol. 85

Funding:

National Natural Science Foundation of China;

Keywords:

Multiagent reinforcement learning; Action advising; Genetic programming; Multiagent systems;

DOI:

10.1016/j.swevo.2024.101475

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action advising is a popular and effective approach to accelerating independent multiagent reinforcement learning (MARL), especially in learning systems where all agents learn from scratch and their roles (advisor or advisee) cannot be predefined. The key component of action advising is the trigger condition, which answers the question of when to advise. Previous work has mainly focused on manually designing novel trigger conditions; however, since those conditions are often designed heuristically, their performance may be affected by the designers' preferences. To this end, this paper attempts to solve the action advising problem automatically using genetic programming (GP), an evolutionary computation technique. A framework incorporating GP into action advising is provided, together with a novel population initialization method to enhance performance. Empirical studies demonstrate the effectiveness of the proposed framework. More importantly, thanks to the high transparency of GP, a comprehensive analysis is also conducted based on the results. Interesting insights into the action advising problem are distilled from the discussion, which may provide guidance for future work.
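As a rough illustration of what a GP-style trigger condition could look like, the sketch below encodes a condition as a small expression tree and advises whenever the tree evaluates to a positive number. This is not the paper's method: the primitive set, the feature names (`q_gap`, `advisor_visits`, `advisee_visits`), and the "advise if output > 0" rule are all hypothetical, chosen only to make the idea concrete.

```python
import operator
import random

# Hypothetical primitive set; the paper's actual functions and terminals
# are not stated in this record.
FUNCTIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "max": max,
    "min": min,
}
# Hypothetical state features available to an agent at decision time.
TERMINALS = ["q_gap", "advisor_visits", "advisee_visits"]


def evaluate(tree, features):
    """Recursively evaluate an expression tree against a feature dict."""
    if isinstance(tree, str):           # terminal: look up a feature value
        return features[tree]
    if isinstance(tree, (int, float)):  # terminal: numeric constant
        return tree
    name, left, right = tree            # internal node: binary function
    return FUNCTIONS[name](evaluate(left, features), evaluate(right, features))


def should_advise(tree, features):
    """Assumed trigger rule: advise when the tree output is positive."""
    return evaluate(tree, features) > 0


def random_tree(depth, rng):
    """Grow a random tree, as a GP population initializer might."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(depth - 1, rng), random_tree(depth - 1, rng))


# Hand-written candidate: advise when the advisor has visited the state
# more often than the advisee.
candidate = ("sub", "advisor_visits", "advisee_visits")
```

Because the condition is an explicit tree, an evolved candidate can be read and analysed directly, which is the kind of transparency the abstract attributes to GP.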

References

Pages: 13

53 entries in total

[1] Amir O., 2016, the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI, P804

[2] A novel binary classification approach based on geometric semantic genetic programming [J]. Bakurov, I.; Castelli, M.; Fontanella, F.; di Freca, A. Scotto; Vanneschi, L. SWARM AND EVOLUTIONARY COMPUTATION, 2022, 69

[3] Barto AG, 1989, ADV NEURAL INFORM PR, V2

[4] Genetic programming for multiple-feature construction on high-dimensional classification [J]. Binh Tran; Xue, Bing; Zhang, Mengjie. PATTERN RECOGNITION, 2019, 93: 404-417

[5] Preserving Population Diversity Based on Transformed Semantics in Genetic Programming for Symbolic Regression [J]. Chen, Qi; Xue, Bing; Zhang, Mengjie. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2021, 25 (03): 433-447

[6] Feature Selection to Improve Generalization of Genetic Programming for High-Dimensional Symbolic Regression [J]. Chen, Qi; Zhang, Mengjie; Xue, Bing. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2017, 21 (05): 792-806

[7] Interactive Policy Learning through Confidence-Based Autonomy [J]. Chernova, Sonia; Veloso, Manuela. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2009, 34: 1-25

[8] Clouse J. A., 1997, On integrating apprentice learning and reinforcement learning

[9] Da Silva FL, 2020, AAAI CONF ARTIF INTE, V34, P5792

[10] Agents teaching agents: a survey on inter-agent transfer learning [J]. Da Silva, Felipe Leno; Warnell, Garrett; Costa, Anna Helena Reali; Stone, Peter. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2020, 34 (01)
