Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications

Cited by: 774
Authors
Nguyen, Thanh Thi [1 ]
Nguyen, Ngoc Duy [2 ]
Nahavandi, Saeid [2 ]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Burwood Campus, Burwood, Vic 3125, Australia
[2] Deakin Univ, Inst Intelligent Syst Res & Innovat, Waurn Ponds Campus, Waurn Ponds, Vic 3216, Australia
Keywords
Mathematical model; Robots; Dynamic programming; Games; Reinforcement learning; Deep learning; Observability; Continuous action space; deep learning; deep reinforcement learning (RL); multiagent; nonstationary; partial observability; review; robotics; survey; DYNAMICS; ROBOTS; GAMES;
DOI
10.1109/TCYB.2020.2977374
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline classification code
0812
Abstract
Reinforcement learning (RL) algorithms have been around for decades and have been employed to solve various sequential decision-making problems. These algorithms, however, have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to derive optimal policies for sophisticated and capable agents that can perform efficiently in these challenging environments. This article addresses an important aspect of deep RL: situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems in multiagent deep RL (MADRL) is presented, including nonstationarity, partial observability, continuous state and action spaces, multiagent training schemes, and multiagent transfer learning. The merits and demerits of the reviewed methods are analyzed and discussed, and their corresponding applications are explored. It is envisaged that this review will provide insights into various MADRL methods and inform the future development of more robust and highly useful multiagent learning methods for solving real-world problems.
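The nonstationarity challenge named in the abstract can be made concrete with a toy sketch. The snippet below is illustrative only and not taken from the paper; the payoff table, hyperparameters, and function names are all assumptions. It runs two independent Q-learners on a repeated 2x2 coordination game: each agent updates its own Q-values while ignoring the other, so from each agent's perspective the reward distribution shifts as its partner's policy changes.

```python
import random

# Shared-reward coordination game: agents are rewarded only when
# they pick the same action. Payoffs are illustrative.
PAYOFF = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0}

def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Train two independent (stateless) epsilon-greedy Q-learners."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]
    for _ in range(episodes):
        acts = []
        for i in range(2):
            if rng.random() < epsilon:                  # explore
                acts.append(rng.randrange(2))
            else:                                       # exploit estimate
                acts.append(0 if q[i][0] >= q[i][1] else 1)
        r = PAYOFF[(acts[0], acts[1])]                  # shared reward
        for i in range(2):
            # Independent update: the partner's evolving policy is folded
            # into r, which is exactly what makes each agent's learning
            # problem nonstationary.
            q[i][acts[i]] += alpha * (r - q[i][acts[i]])
    return q

if __name__ == "__main__":
    print(train())
```

MADRL methods surveyed in the paper address this by, e.g., centralized training with decentralized execution, rather than the fully independent updates sketched here.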
Pages: 3826-3839
Page count: 14
Related papers: 134 entries