Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

Cited by: 292
Authors
Dulac-Arnold, Gabriel [1]
Levine, Nir [2]
Mankowitz, Daniel J. [2]
Li, Jerry [2]
Paduraru, Cosmin [2]
Gowal, Sven [2]
Hester, Todd [2]
Affiliations
[1] Google Research, Paris, France
[2] DeepMind, London, England
Keywords
Reinforcement learning; Real-world; Applied reinforcement learning; MDPS; SKILLS; GO
DOI
10.1007/s10994-021-05961-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite, which we propose as an open-source benchmark.
Pages: 2419-2468
Page count: 50