Learning One-Shot Imitation From Humans Without Humans

Cited by: 45
Authors
Bonardi, Alessandro [1 ]
James, Stephen [2 ]
Davison, Andrew J. [2 ]
Affiliations
[1] Imperial Coll London, Dept Comp, London SW7 2BU, England
[2] Imperial Coll London, Dyson Robot Lab, London SW7 2BU, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Learning from demonstration; deep learning in robotics and automation; perception for grasping and manipulation; TASKS;
DOI
10.1109/LRA.2020.2977835
Chinese Library Classification (CLC)
TP24 [Robotics];
Subject classification codes
080202 ; 1405 ;
Abstract
Humans can naturally learn to execute a new task after seeing it performed by another individual just once, and can then reproduce it in a variety of configurations. Endowing robots with this ability to imitate humans from a third-person view is an immediate and natural way of teaching new tasks. Only recently, through meta-learning, have there been successful attempts at one-shot imitation learning from humans; however, these approaches require considerable human effort to collect real-world training data. But is there a way to remove the need for real-world human demonstrations during training? We show that with Task-Embedded Control Networks we can infer control policies by embedding human demonstrations that condition a control policy, achieving one-shot imitation learning. Importantly, we do not use a real human arm to supply demonstrations during training, but instead leverage domain randomisation in an application that has not been seen before: sim-to-real transfer on humans. Evaluating our approach on pushing and placing tasks in both simulation and the real world, we show that, using only simulation data, we achieve results similar to those of a system trained on real-world data. Videos can be found here: https://sites.google.com/view/tecnets-humans.
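The conditioning mechanism the abstract describes can be illustrated with a minimal sketch. This is not the authors' code: the network weights, dimensions, and function names below are hypothetical placeholders standing in for trained networks, but the structure follows the idea of embedding a demonstration into a normalised task vector that conditions a control policy alongside the current observation.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, EMB_DIM, ACT_DIM = 16, 8, 4  # toy dimensions, chosen for illustration

# Hypothetical frozen weights standing in for trained networks.
W_embed = rng.standard_normal((OBS_DIM, EMB_DIM))           # embedding net
W_ctrl = rng.standard_normal((OBS_DIM + EMB_DIM, ACT_DIM))  # control net

def embed_demo(demo_frames: np.ndarray) -> np.ndarray:
    """Embed each demo frame, average over time, and L2-normalise
    to obtain a single task-embedding vector."""
    per_frame = demo_frames @ W_embed      # (T, EMB_DIM)
    task = per_frame.mean(axis=0)
    return task / np.linalg.norm(task)

def policy(obs: np.ndarray, task_emb: np.ndarray) -> np.ndarray:
    """Control policy conditioned on the task embedding: the embedding
    is concatenated with the current observation before the action map."""
    return np.concatenate([obs, task_emb]) @ W_ctrl

demo = rng.standard_normal((10, OBS_DIM))  # one (simulated) demonstration
obs = rng.standard_normal(OBS_DIM)         # current robot observation
action = policy(obs, embed_demo(demo))
print(action.shape)  # -> (4,)
```

At test time, a single new demonstration is embedded once and then reused to condition the policy at every control step, which is what makes the approach one-shot.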
Pages: 3533-3539
Page count: 7