Reinforcement learning for imitating constrained reaching movements

Cited: 4
Authors
Guenter, Florent [1]
Hersch, Micha [1]
Calinon, Sylvain [1]
Billard, Aude [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, LASA Lab, CH-1015 Lausanne, Switzerland
Keywords
programming by demonstration; reinforcement learning; dynamical systems; Gaussian mixture model
DOI
Not available
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
The goal of developing algorithms for programming robots by demonstration is to make robot programming simple enough to be accomplished by anyone. When a demonstrator teaches a task to a robot, he or she shows some ways of fulfilling the task, but not all of the possibilities. The robot must nevertheless be able to reproduce the task even when unexpected perturbations occur; in that case, it has to learn a new solution. Here, we describe a system for teaching the robot constrained reaching tasks. Our system is based on a dynamical system generator modulated by a learned speed trajectory. This generator is combined with a reinforcement learning module that allows the robot to adapt the trajectory when facing a new situation, e.g., in the presence of obstacles.
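The abstract's idea of a dynamical-system generator gated by a learned speed profile can be illustrated with a minimal sketch. The Python code below is an assumption-laden illustration, not the paper's implementation: it integrates a VITE-style point attractor whose position update is scaled by a hand-coded bell-shaped speed signal, standing in for a profile that would in practice be learned from demonstrations (e.g., with a Gaussian mixture model); the reinforcement-learning adaptation and obstacle handling described in the abstract are omitted. All function names and constants (go_signal, reach, alpha) are illustrative.

import numpy as np

def go_signal(t, duration):
    """Bell-shaped gating signal; a hand-coded stand-in for a learned speed profile."""
    s = np.clip(t / duration, 0.0, 1.0)
    return 30.0 * s * (1.0 - s)  # zero at start and end, peaks mid-movement

def reach(x0, target, duration=1.0, dt=0.001, alpha=10.0):
    """Integrate gated attractor dynamics from x0 toward target; returns the path."""
    x = np.asarray(x0, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    v = np.zeros_like(x)
    path = [x.copy()]
    for k in range(int(duration / dt)):
        t = k * dt
        v += alpha * ((target - x) - v) * dt   # difference-vector dynamics
        x += go_signal(t, duration) * v * dt   # speed-modulated position update
        path.append(x.copy())
    return np.array(path)

if __name__ == "__main__":
    traj = reach(x0=[0.0, 0.0], target=[0.3, 0.5])
    print("end point:", np.round(traj[-1], 3))

In this sketch the attractor guarantees convergence toward the target, while the gating signal shapes when along the movement most of the displacement happens; learning that gating signal from demonstrations, and adapting the trajectory by reinforcement learning, are the contributions the abstract attributes to the paper itself.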
Pages: 1521-1544
Page count: 24