Teaching Humanoid Robot Reaching Motion by Imitation and Reinforcement Learning

Cited by: 1
Authors
Savevska, Kristina [1 ,2 ]
Ude, Ales [1 ]
Affiliations
[1] Jozef Stefan Inst, Dept Automat Biocybernet & Robot, Humanoid & Cognit Robot Lab, Jamova Cesta 39, Ljubljana 1000, Slovenia
[2] Int Postgrad Sch Jozef Stefan, Jamova Cesta 39, Ljubljana 1000, Slovenia
Source
ADVANCES IN SERVICE AND INDUSTRIAL ROBOTICS, RAAD 2023 | 2023 / Vol. 135
Keywords
Humanoids; Imitation learning; Reinforcement learning
DOI
10.1007/978-3-031-32606-6_7
CLC Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
This paper presents a user-friendly method for programming humanoid robots that does not require expert knowledge. We propose a combination of imitation learning and reinforcement learning to teach and optimize demonstrated trajectories. An initial trajectory for reinforcement learning is generated using a stable whole-body motion imitation system. The acquired motion is then refined using a stochastic optimal control-based reinforcement learning algorithm, Path Integral Policy Improvement with Covariance Matrix Adaptation (PI2-CMA). We tested the approach by programming humanoid robot reaching motions. Our experimental results show that the proposed approach successfully learns reaching motions while preserving the postural balance of the robot. We also show how a stable humanoid robot trajectory learned in simulation can be effectively adapted to different dynamic environments, e.g., a different simulator or a real robot. The resulting learning methodology allows for quick and efficient optimization of the demonstrated trajectories while taking the constraints of the desired task into account. The methodology was tested in a simulated environment and on the real humanoid robot TALOS.
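The PI2-CMA refinement step summarized in the abstract amounts to reward-weighted averaging of sampled policy-parameter perturbations, with the sampling covariance adapted from the same weighted samples. The sketch below is a minimal, generic illustration of that update under assumed names (`pi2_cma_update`, a toy quadratic `cost` standing in for a rollout cost); it is not the authors' implementation and omits the DMP trajectory representation and the balance constraints used in the paper.

```python
import numpy as np

def pi2_cma_update(theta, sigma, rollout_cost, rng, n_samples=20, h=10.0):
    # Sample perturbations of the policy parameters from N(0, sigma).
    eps = rng.multivariate_normal(np.zeros(len(theta)), sigma, size=n_samples)
    costs = np.array([rollout_cost(theta + e) for e in eps])
    # Min-max normalize the costs, then turn them into softmax-style
    # weights; h controls how greedily low-cost rollouts dominate.
    c = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-h * c)
    w /= w.sum()
    # Reward-weighted averaging of the perturbations updates the mean ...
    theta_new = theta + w @ eps
    # ... and weighted outer products adapt the exploration covariance
    # (the "CMA" part); a small floor keeps exploration alive.
    sigma_new = sum(wi * np.outer(e, e) for wi, e in zip(w, eps))
    sigma_new += 1e-2 * np.eye(len(theta))
    return theta_new, sigma_new

# Toy usage: refine parameters toward the minimum of a quadratic cost,
# a stand-in for the rollout cost of a reaching trajectory.
cost = lambda th: float(np.sum((th - 1.0) ** 2))
rng = np.random.default_rng(0)
theta, sigma = np.zeros(3), 0.5 * np.eye(3)
for _ in range(50):
    theta, sigma = pi2_cma_update(theta, sigma, cost, rng)
```

Because the cost normalization is scale-invariant, the selection pressure stays constant as the covariance shrinks near a minimum, which is what lets the algorithm converge without a hand-tuned step size.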
Pages: 53-61
Page count: 9