A Multi-Agent Adaptive Co-Evolution Method in Dynamic Environments

Times Cited: 0
Authors
Li, Yan [1 ]
Zhang, Huazhi [1 ]
Xu, Weiming [1 ]
Wang, Jianan [1 ]
Wang, Jialu [1 ]
Wang, Suyu [1 ]
Affiliations
[1] China Univ Min & Technol Beijing, Sch Mech Elect & Informat Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multi-agent; dynamic environment; co-evolution; adaptive; experience screening; REINFORCEMENT; COOPERATION; FLOCKING;
DOI
10.3390/math11102379
CLC Number
O1 [Mathematics];
Discipline Classification Code
0701; 070101;
Abstract
It is challenging to ensure satisfactory co-evolution efficiency for multi-agents in dynamic environments, since Actor-Critic training has a high probability of falling into local optima and failing to adapt quickly to a suddenly changed environment. To solve this problem, this paper proposes a multi-agent adaptive co-evolution method in dynamic environments (ACE-D) based on the classical multi-agent reinforcement learning method MADDPG, which effectively realizes self-adaptation to new environments and co-evolution in dynamic environments. First, an experience screening policy is introduced into the MADDPG method to reduce the negative influence of experience from the original environment on the exploration of new environments. Then, an adaptive weighting policy is applied to the policy network, which generates benchmarks for the varying environments and assigns higher weights to those policies that are more beneficial for exploring new environments, thereby saving time and promoting the adaptability of the agents. Finally, different types of dynamic environments with different levels of complexity are built to verify the co-evolutionary effects of the two policies separately and of the ACE-D method as a whole. The experimental results demonstrate that, compared with a range of other methods, the ACE-D method has obvious advantages in helping multi-agents adapt to dynamic environments and preventing them from falling into local optima, with more than a 25% improvement in stable reward and more than a 23% improvement in training efficiency. The ACE-D method is thus valuable for promoting the co-evolutionary effect of multi-agents in dynamic environments.
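To make the experience-screening idea concrete, below is a minimal Python sketch of a replay buffer that tags each transition with the environment version it was collected in and probabilistically filters stale experience when sampling. The class name, the stale_keep_prob knob, and the version-tagging scheme are illustrative assumptions made for this sketch; the paper's actual ACE-D screening rule may differ.

    # Minimal sketch of an experience-screening replay buffer (illustrative only).
    # The screening criterion, threshold, and buffer layout are assumptions;
    # ACE-D's exact screening policy is described in the paper itself.
    import random
    from collections import deque

    class ScreenedReplayBuffer:
        def __init__(self, capacity=100_000, stale_keep_prob=0.2):
            self.buffer = deque(maxlen=capacity)
            # Probability of keeping a pre-change transition when sampling
            # (hypothetical knob standing in for the screening policy).
            self.stale_keep_prob = stale_keep_prob
            self.env_version = 0  # incremented whenever the environment changes

        def mark_environment_change(self):
            # Called when a sudden environment change is detected.
            self.env_version += 1

        def add(self, state, action, reward, next_state, done):
            # Tag each transition with the environment version it came from.
            self.buffer.append(
                (self.env_version, (state, action, reward, next_state, done))
            )

        def sample(self, batch_size):
            # Draw a batch, probabilistically screening out stale experience
            # from earlier environment versions.
            batch = []
            while len(batch) < batch_size and self.buffer:
                version, transition = random.choice(self.buffer)
                is_stale = version < self.env_version
                if not is_stale or random.random() < self.stale_keep_prob:
                    batch.append(transition)
            return batch

In an MADDPG-style training loop, mark_environment_change() would be invoked when a change detector fires, and sample() would replace uniform replay sampling so that experience from the original environment has less influence on exploration of the new one.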
Pages: 18