The Hanabi challenge: A new frontier for AI research

Cited by: 133

Authors:

Bard, Nolan [1]

Foerster, Jakob N. [2]

Chandar, Sarath [3]

Burch, Neil [1]

Lanctot, Marc [1]

Song, H. Francis [4]

Parisotto, Emilio [5]

Dumoulin, Vincent [3]

Moitra, Subhodeep [3]

Hughes, Edward [4]

Dunning, Iain [4]

Mourad, Shibl [6]

Larochelle, Hugo [3]

Bellemare, Marc G. [3]

Bowling, Michael [1]

Affiliations:

[1] DeepMind, Edmonton, AB, Canada

[2] Univ Oxford, Oxford, England

[3] Google Brain, Montreal, PQ, Canada

[4] DeepMind, London, England

[5] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[6] DeepMind, Montreal, PQ, Canada

Source:

ARTIFICIAL INTELLIGENCE | 2020 / Volume 280

Keywords:

Multi-agent learning; Challenge paper; Reinforcement learning; Games; Theory of mind; Communication; Imperfect information; Cooperative; ARCADE LEARNING-ENVIRONMENT; COMPREHENSIVE SURVEY; REINFORCEMENT; GAME; GO; POKER;

DOI:

10.1016/j.artint.2019.103216

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques for such theory of mind reasoning will not only be crucial for success in Hanabi, but also in broader collaborative efforts, especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques. © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Pages: 19

References: 85 in total

