Learning From Imperfect Demonstrations From Agents With Varying Dynamics

Cited by: 8

Authors:

Cao, Zhangjie [1]

Sadigh, Dorsa [1]

Affiliations:

[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2021 / Vol. 6 / No. 3

Funding:

National Science Foundation (United States);

Keywords:

Trajectory; Robots; Vehicle dynamics; Measurement; Heuristic algorithms; Task analysis; Reinforcement learning; Imitation learning; learning from demonstrations; robot learning;

DOI:

10.1109/LRA.2021.3068912

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification codes:

080202 ; 1405 ;

Abstract:

Imitation learning enables robots to learn from demonstrations. Previous imitation learning algorithms usually assume access to optimal expert demonstrations. However, in many real-world applications, this assumption is limiting. Most collected demonstrations are not optimal or are produced by an agent with slightly different dynamics. We therefore address the problem of imitation learning when the demonstrations can be sub-optimal or be drawn from agents with varying dynamics. We develop a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning. The proposed score enables learning from more informative demonstrations, and disregarding the less relevant demonstrations. Our experiments on four environments in simulation and on a real robot show improved learned policies with higher expected return.

Pages: 5231-5238

Page count: 8

