Teaching Humanoid Robot Reaching Motion by Imitation and Reinforcement Learning

Cited by: 1
Authors
Savevska, Kristina [1 ,2 ]
Ude, Ales [1 ]
Affiliations
[1] Jozef Stefan Inst, Dept Automat Biocybernet & Robot, Humanoid & Cognit Robot Lab, Jamova Cesta 39, Ljubljana 1000, Slovenia
[2] Int Postgrad Sch Jozef Stefan, Jamova Cesta 39, Ljubljana 1000, Slovenia
Source
ADVANCES IN SERVICE AND INDUSTRIAL ROBOTICS, RAAD 2023 | 2023 / Vol. 135
Keywords
Humanoids; Imitation learning; Reinforcement learning
DOI
10.1007/978-3-031-32606-6_7
CLC Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
This paper presents a user-friendly method for programming humanoid robots that does not require expert knowledge. We propose a combination of imitation learning and reinforcement learning to teach and optimize demonstrated trajectories. An initial trajectory for reinforcement learning is generated using a stable whole-body motion imitation system. The acquired motion is then refined using a stochastic optimal control-based reinforcement learning algorithm, Path Integral Policy Improvement with Covariance Matrix Adaptation (PI2-CMA). We tested the approach by programming humanoid robot reaching motions. Our experimental results show that the proposed approach successfully learns reaching motions while preserving the postural balance of the robot. We also show how a stable humanoid robot trajectory learned in simulation can be effectively adapted to different dynamic environments, e.g., a different simulator or a real robot. The resulting learning methodology allows for quick and efficient optimization of the demonstrated trajectories while taking the constraints of the desired task into account. The methodology was tested in a simulated environment and on the real humanoid robot TALOS.
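The PI2-CMA refinement step summarized in the abstract amounts to reward-weighted averaging of sampled policy-parameter perturbations, with the sampling covariance adapted from the same weighted samples. The sketch below is a minimal, generic illustration of that update under assumed names (`pi2_cma_update`, a toy quadratic `cost` standing in for a rollout cost); it is not the authors' implementation and omits the DMP trajectory representation and the balance constraints used in the paper.

```python
import numpy as np

def pi2_cma_update(theta, sigma, rollout_cost, rng, n_samples=20, h=10.0):
    # Sample perturbations of the policy parameters from N(0, sigma).
    eps = rng.multivariate_normal(np.zeros(len(theta)), sigma, size=n_samples)
    costs = np.array([rollout_cost(theta + e) for e in eps])
    # Min-max normalize the costs, then turn them into softmax-style
    # weights; h controls how greedily low-cost rollouts dominate.
    c = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-h * c)
    w /= w.sum()
    # Reward-weighted averaging of the perturbations updates the mean ...
    theta_new = theta + w @ eps
    # ... and weighted outer products adapt the exploration covariance
    # (the "CMA" part); a small floor keeps exploration alive.
    sigma_new = sum(wi * np.outer(e, e) for wi, e in zip(w, eps))
    sigma_new += 1e-2 * np.eye(len(theta))
    return theta_new, sigma_new

# Toy usage: refine parameters toward the minimum of a quadratic cost,
# a stand-in for the rollout cost of a reaching trajectory.
cost = lambda th: float(np.sum((th - 1.0) ** 2))
rng = np.random.default_rng(0)
theta, sigma = np.zeros(3), 0.5 * np.eye(3)
for _ in range(50):
    theta, sigma = pi2_cma_update(theta, sigma, cost, rng)
```

Because the cost normalization is scale-invariant, the selection pressure stays constant as the covariance shrinks near a minimum, which is what lets the algorithm converge without a hand-tuned step size.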
Pages: 53-61
Page count: 9