Transferring policy of deep reinforcement learning from simulation to reality for robotics

Cited by: 51
Authors
Ju, Hao [1 ]
Juan, Rongshun [1 ]
Gomez, Randy [2 ]
Nakamura, Keisuke [2 ]
Li, Guangliang [1 ]
Affiliations
[1] Ocean University of China, Qingdao, People's Republic of China
[2] Honda Research Institute Japan Co., Ltd., Wako, Japan
Funding
National Natural Science Foundation of China
Keywords
Neural networks; Domains
DOI
10.1038/s42256-022-00573-6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning has achieved great success in many fields and has shown promise in learning robust skills for robot control in recent years. However, sample efficiency and safety concerns still limit its application to robot control in the real world. One common solution is to train the robot control policy in a simulated environment and then transfer it to the real world. However, policies trained in simulation usually perform unsatisfactorily in the real world because simulators inevitably model reality imperfectly. Inspired by biological transfer learning processes in the brains of humans and other animals, sim-to-real transfer reinforcement learning has been proposed and has become a focus for researchers applying reinforcement learning to robotics. Here, we describe state-of-the-art sim-to-real transfer reinforcement learning methods, which are inspired by insights into transfer learning in nature, such as extracting features shared between tasks, enriching training experience, multitask learning, continual learning and fast learning. Our objective is to present a comprehensive survey of the most recent advances in sim-to-real transfer reinforcement learning, and we hope it can facilitate the application of deep reinforcement learning to complex robot control problems in the real world.
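The "enriching training experience" the abstract mentions covers approaches such as domain randomization, in which simulator parameters are resampled during training so the learned policy must generalize across the sim-to-real gap rather than overfit one imperfect model of reality. Below is a minimal illustrative sketch of that idea, not the paper's method: the parameter names and ranges are invented for illustration, and `simulator.reset`, `simulator.rollout` and `policy.update` are assumed interfaces rather than any specific library's API.

```python
import random

# Hypothetical physics-parameter ranges; real choices depend on the robot,
# the task and the simulator (e.g. MuJoCo, PyBullet).
PARAM_RANGES = {
    "mass_scale":    (0.8, 1.2),    # multiplier on link masses
    "friction":      (0.5, 1.5),    # contact friction coefficient
    "motor_latency": (0.00, 0.04),  # actuation delay in seconds
}

def sample_randomized_params():
    """Draw one simulator configuration; called once per training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def train(policy, simulator, num_episodes=10_000):
    """Domain-randomized training loop (sketch, assumed interfaces).

    Each episode runs in a differently perturbed simulator, so the policy
    cannot overfit to any single (inevitably imperfect) model of reality.
    """
    for _ in range(num_episodes):
        simulator.reset(**sample_randomized_params())  # assumed simulator API
        episode = simulator.rollout(policy)            # collect one trajectory
        policy.update(episode)                         # any RL update rule (e.g. PPO)
```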
Pages: 1077-1087
Page count: 11