A survey and critique of multiagent deep reinforcement learning

Cited by: 385
Authors
Hernandez-Leal, Pablo [1 ]
Kartal, Bilal [1 ]
Taylor, Matthew E. [1 ]
Affiliations
[1] Borealis AI, Edmonton, AB, Canada
Keywords
Multiagent learning; Multiagent systems; Multiagent reinforcement learning; Deep reinforcement learning; Survey; NEURAL-NETWORKS; COMPREHENSIVE SURVEY; FICTITIOUS PLAY; GAMES; ALGORITHMS; LEARNERS; AGENTS; APPROXIMATION; INTELLIGENCE; COORDINATION;
DOI
10.1007/s10458-019-09421-1
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Subject Classification
0812
Abstract
Deep reinforcement learning (RL) has achieved outstanding results in recent years, leading to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) settings. Initial results report successes in complex multiagent domains, although several challenges remain to be addressed. The primary goal of this article is to provide a clear overview of the current multiagent deep reinforcement learning (MDRL) literature. Additionally, we complement the overview with a broader analysis: (i) we revisit previous key components, originally presented in MAL and RL, and highlight how they have been adapted to multiagent deep reinforcement learning settings; (ii) we provide general guidelines to new practitioners in the area, describing lessons learned from MDRL works, pointing to recent benchmarks, and outlining open avenues of research; (iii) we take a more critical tone, raising practical challenges of MDRL (e.g., implementation and computational demands). We expect this article will help unify and motivate future research to take advantage of the abundant literature that exists (e.g., RL and MAL) in a joint effort to promote fruitful research in the multiagent community.
Pages: 750-797
Page count: 48