Correct Me If I am Wrong: Interactive Learning for Robotic Manipulation

Cited by: 18
Authors
Chisari, Eugenio [1 ]
Welschehold, Tim [1 ]
Boedecker, Joschka [1 ]
Burgard, Wolfram [1 ]
Valada, Abhinav [1 ]
Affiliations
[1] Univ Freiburg, Dept Comp Sci, D-70173 Freiburg im Breisgau, Baden-Wurttemberg, Germany
Keywords
Robots; Trajectory; Training; Task analysis; Visualization; Computer architecture; Reinforcement learning; Deep learning in grasping and manipulation; human factors and human-in-the-loop; imitation learning; interactive learning; machine learning for robot control
DOI
10.1109/LRA.2022.3145516
Chinese Library Classification
TP24 [Robotics]
Subject Classification Code
080202; 1405
Abstract
Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Although deep reinforcement learning algorithms have recently demonstrated impressive results in this context, they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive learning in which a human teacher provides feedback to the policy during execution, as opposed to imitation learning where a pre-collected dataset of perfect demonstrations is used. Our proposed CEILing (Corrective and Evaluative Interactive Learning) framework combines both corrective and evaluative feedback from the teacher to train a stochastic policy in an asynchronous manner, and employs a dedicated mechanism to trade off human corrections with the robot's own experience. We present results obtained with our framework in extensive simulation and real-world experiments to demonstrate that CEILing can effectively solve complex robot manipulation tasks directly from raw images in less than one hour of real-world training.
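As a rough illustration of the interactive learning scheme the abstract describes, the Python sketch below shows one way corrective and evaluative teacher feedback could be combined to train a stochastic policy. It is a minimal sketch under stated assumptions: the GaussianPolicy network, the env and teacher interfaces (teacher.correct, teacher.evaluate), the feedback labels, and the weighted behavior-cloning loss are illustrative choices, not the CEILing paper's actual implementation.

```python
# Illustrative sketch only: combines corrective and evaluative teacher feedback
# in an interactive training loop. The environment and teacher interfaces are
# hypothetical placeholders, not the CEILing implementation.
import random
import torch
import torch.nn as nn


class GaussianPolicy(nn.Module):
    """Small stochastic policy mapping an observation vector to a Gaussian over actions."""

    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs: torch.Tensor) -> torch.distributions.Normal:
        return torch.distributions.Normal(self.net(obs), self.log_std.exp())


def interactive_episode(policy, env, teacher, buffer):
    """Roll out the policy; the teacher may correct an action or label it good/bad.

    `env` and `teacher` are assumed interfaces: env.reset() -> obs,
    env.step(action) -> (obs, done), teacher.correct(obs, action) -> action or None,
    teacher.evaluate(obs, action) -> 1.0 (good) or 0.0 (bad).
    """
    obs = env.reset()
    done = False
    while not done:
        obs_t = torch.as_tensor(obs, dtype=torch.float32)
        action = policy.dist(obs_t).sample()
        correction = teacher.correct(obs, action)  # None if no correction given
        executed = correction if correction is not None else action
        # A correction is stored as a positively labelled target action;
        # otherwise the evaluative signal weights the executed action.
        label = 1.0 if correction is not None else teacher.evaluate(obs, executed)
        buffer.append((obs_t, torch.as_tensor(executed, dtype=torch.float32), label))
        obs, done = env.step(executed)


def train_step(policy, optimizer, buffer, batch_size=64):
    """One (asynchronous) update: weighted behavior cloning on feedback-labelled data."""
    batch = random.sample(buffer, min(batch_size, len(buffer)))
    obs = torch.stack([b[0] for b in batch])
    act = torch.stack([b[1] for b in batch])
    weights = torch.tensor([b[2] for b in batch])
    log_prob = policy.dist(obs).log_prob(act).sum(dim=-1)
    loss = -(weights * log_prob).mean()  # only positively labelled samples contribute
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this sketch the evaluative labels simply act as per-sample weights on a log-likelihood objective, so "bad" experience is ignored while corrections and "good" robot experience are both cloned; running episode collection and train_step in separate threads would give the asynchronous flavor the abstract mentions.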
Pages: 3695-3702
Page count: 8
相关论文
共 40 条
[1]   Autonomous Helicopter Aerobatics through Apprenticeship Learning [J].
Abbeel, Pieter ;
Coates, Adam ;
Ng, Andrew Y. .
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2010, 29 (13) :1608-1639
[2]  
Akrour R, 2014, PR MACH LEARN RES, V32, P1503
[3]  
Akrour R, 2011, LECT NOTES ARTIF INT, V6911, P12, DOI 10.1007/978-3-642-23780-5_11
[4]  
[Anonymous], 2013, Policy shaping: Integrating human feedback with reinforcement learning
[5]  
Arumugam D., 2019, ABS190204257 CORR
[6]  
Biyik E., 2020, ROBOT SCI SYST
[7]  
Biyik E, 2019, PR MACH LEARN RES, V100
[8]  
Blumberg B, 2002, ACM T GRAPHIC, V21, P417, DOI 10.1145/566570.566597
[9]   Perspectives on Deep Multimodel Robot Learning [J].
Burgard, Wolfram ;
Valada, Abhinav ;
Radwan, Noha ;
Naseer, Tayyab ;
Zhang, Jingwei ;
Vertens, Johan ;
Mees, Oier ;
Eitel, Andreas ;
Oliveira, Gabriel .
ROBOTICS RESEARCH, 2020, 10 :17-24
[10]  
Cederborg T, 2015, PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), P3366