Transferring policy of deep reinforcement learning from simulation to reality for robotics

Cited by: 51
Authors
Ju, Hao [1 ]
Juan, Rongshun [1 ]
Gomez, Randy [2 ]
Nakamura, Keisuke [2 ]
Li, Guangliang [1 ]
Affiliations
[1] Ocean University of China, Qingdao, People's Republic of China
[2] Honda Research Institute Japan Co., Ltd., Wako, Japan
Funding
National Natural Science Foundation of China
Keywords
Neural networks; Domains
DOI
10.1038/s42256-022-00573-6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning has achieved great success in many fields and has shown promise in learning robust skills for robot control in recent years. However, sample efficiency and safety concerns still limit its application to robot control in the real world. One common solution is to train the robot control policy in a simulated environment and then transfer it to the real world. However, policies trained in simulation usually perform unsatisfactorily in the real world because simulators inevitably model reality imperfectly. Inspired by biological transfer learning processes in the brains of humans and other animals, sim-to-real transfer reinforcement learning has been proposed and has become a focus for researchers applying reinforcement learning to robotics. Here, we describe state-of-the-art sim-to-real transfer reinforcement learning methods, which are inspired by insights into transfer learning in nature, such as extracting features shared between tasks, enriching training experience, multitask learning, continual learning and fast learning. Our objective is to present a comprehensive survey of the most recent advances in sim-to-real transfer reinforcement learning, and we hope it can facilitate the application of deep reinforcement learning to complex robot control problems in the real world.
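The "enriching training experience" the abstract mentions covers approaches such as domain randomization, in which simulator parameters are resampled during training so the learned policy must generalize across the sim-to-real gap rather than overfit one imperfect model of reality. Below is a minimal illustrative sketch of that idea, not the paper's method: the parameter names and ranges are invented for illustration, and `simulator.reset`, `simulator.rollout` and `policy.update` are assumed interfaces rather than any specific library's API.

```python
import random

# Hypothetical physics-parameter ranges; real choices depend on the robot,
# the task and the simulator (e.g. MuJoCo, PyBullet).
PARAM_RANGES = {
    "mass_scale":    (0.8, 1.2),    # multiplier on link masses
    "friction":      (0.5, 1.5),    # contact friction coefficient
    "motor_latency": (0.00, 0.04),  # actuation delay in seconds
}

def sample_randomized_params():
    """Draw one simulator configuration; called once per training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def train(policy, simulator, num_episodes=10_000):
    """Domain-randomized training loop (sketch, assumed interfaces).

    Each episode runs in a differently perturbed simulator, so the policy
    cannot overfit to any single (inevitably imperfect) model of reality.
    """
    for _ in range(num_episodes):
        simulator.reset(**sample_randomized_params())  # assumed simulator API
        episode = simulator.rollout(policy)            # collect one trajectory
        policy.update(episode)                         # any RL update rule (e.g. PPO)
```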
Pages: 1077-1087
Page count: 11