Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

Cited by: 292

Authors:

Dulac-Arnold, Gabriel [1]

Levine, Nir [2]

Mankowitz, Daniel J. [2]

Li, Jerry [2]

Paduraru, Cosmin [2]

Gowal, Sven [2]

Hester, Todd [2]

Affiliations:

[1] Google Res, Paris, France

[2] DeepMind, London, England

Source:

MACHINE LEARNING | 2021, Volume 110, Issue 9

Keywords:

Reinforcement learning; Real-world; Applied reinforcement learning; MDPS; SKILLS; GO;

DOI:

10.1007/s10994-021-05961-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite, which we propose as an open-source benchmark.


Pages: 2419-2468

Page count: 50

