Learning Robot Motion Control with Demonstration and Advice-Operators

Cited by: 37

Authors:

Argall, Brenna D. [1]

Browning, Brett [1]

Veloso, Manuela [2]

Affiliations:

[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA

[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

Source:

2008 IEEE/RSJ INTERNATIONAL CONFERENCE ON ROBOTS AND INTELLIGENT SYSTEMS, VOLS 1-3, CONFERENCE PROCEEDINGS | 2008

Keywords:

DOI:

10.1109/IROS.2008.4651020

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

As robots become more commonplace within society, the need for tools that enable non-robotics-experts to develop control algorithms, or policies, will increase. Learning from Demonstration (LfD) offers one promising approach, where the robot learns a policy from teacher task executions. Our interests lie with robot motion control policies, which map world observations to continuous low-level actions. In this work, we introduce Advice-Operator Policy Improvement (A-OPI) as a novel approach for improving policies within LfD. Two distinguishing characteristics of the A-OPI algorithm are its data source and its continuous state-action space. Within LfD, more example data can improve a policy. In A-OPI, new data is synthesized from a student execution and teacher advice. By contrast, typical demonstration approaches provide the learner exclusively with teacher executions. A-OPI is effective within continuous state-action spaces because high-level human advice is translated into continuous-valued corrections on the student execution. This work presents a first implementation of the A-OPI algorithm, validated on a Segway RMP robot performing a spatial positioning task. A-OPI is found to improve task performance, both in success and accuracy. Furthermore, performance is shown to be similar or superior to that of the typical approach, which relies exclusively on teacher demonstrations.
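The data-synthesis idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the observation and action layouts, the `scale_rotation` operator, the `apply_advice` helper, and the 1-nearest-neighbour policy are all hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern of taking a segment of a student execution, applying a teacher-selected operator that corrects the recorded actions by a continuous amount, and adding the corrected points to the policy's dataset.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical layouts, assumed for illustration only.
Observation = Tuple[float, float]  # e.g. (distance to goal, heading error)
Action = Tuple[float, float]       # e.g. (translational speed, rotational speed)
Operator = Callable[[Observation, Action], Action]


@dataclass
class DatasetPolicy:
    """Policy backed by example data; 1-nearest-neighbour is an assumed stand-in."""
    points: List[Tuple[Observation, Action]] = field(default_factory=list)

    def predict(self, obs: Observation) -> Action:
        if not self.points:
            raise ValueError("policy has no example data")
        # Return the action of the stored observation closest to obs.
        nearest = min(
            self.points,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], obs)),
        )
        return nearest[1]


def scale_rotation(factor: float) -> Operator:
    """Hypothetical advice-operator: scale the rotational speed by a factor."""
    def operator(obs: Observation, action: Action) -> Action:
        return (action[0], action[1] * factor)
    return operator


def apply_advice(policy: DatasetPolicy,
                 execution: List[Tuple[Observation, Action]],
                 operator: Operator,
                 start: int,
                 stop: int) -> List[Tuple[Observation, Action]]:
    """Correct the student's actions on execution[start:stop] and add them as new data."""
    new_points = [(obs, operator(obs, act)) for obs, act in execution[start:stop]]
    policy.points.extend(new_points)
    return new_points


# One teacher demonstration point seeds the policy.
policy = DatasetPolicy(points=[((1.0, 0.5), (0.4, 0.2))])

# The student executes the policy; the teacher advises "turn less" on the last two steps.
student_execution = [((1.0, 0.5), (0.4, 0.2)),
                     ((0.8, 0.4), (0.4, 0.2)),
                     ((0.6, 0.3), (0.4, 0.2))]
added = apply_advice(policy, student_execution, scale_rotation(0.5), start=1, stop=3)
```

In this sketch the corrected points pair the student's own observations with modified actions, so the dataset grows in the states the student actually visited, without a new teacher execution.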


Pages: 399-404

Page count: 6

