Full-Body Postural Control of a Humanoid Robot with Both Imitation Learning and Skill Innovation

Cited by: 6
Authors
Gonzalez-Fierro, Miguel [1 ]
Balaguer, Carlos [1 ]
Swann, Nicola [2 ]
Nanayakkara, Thrishantha [3 ]
Affiliations
[1] Univ Carlos III Madrid, Robot Lab, Madrid 28912, Spain
[2] Univ Kingston, Sch Life Sci, Kingston Upon Thames KT1 2EE, Surrey, England
[3] Kings Coll London, Ctr Robot Res, London WC2R 2LS, England
Keywords
Learning from demonstration; skill innovation; postural control; humanoid robot; differential evolution; mirror neurons; chimpanzees; system
DOI
10.1142/S0219843614500121
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Classification Code
080202; 1405
Abstract
In this paper, we present a novel methodology for obtaining imitative and innovative postural movements in a humanoid robot based on human demonstrations at a different kinematic scale. We collected motion data from a group of human participants standing up from a chair. Modeling the human as an actuated three-link kinematic chain and defining a multi-objective reward function of zero moment point (ZMP) and joint torques to represent stability and effort, we computed reward profiles for each demonstration. Since individual reward profiles show variability across demonstration trials, the underlying state transition probabilities were modeled using a Markov chain. Based on the argument that the reward profiles of the robot should show the same temporal structure as those of the human, we used differential evolution to compute a trajectory that satisfies all humanoid constraints and minimizes the difference between the robot's reward profile and the profile predicted if the robot imitates the human. Robotic imitation therefore involves developing a policy that produces a temporal reward structure matching that of a group of human demonstrators across an array of demonstrations. Skill innovation was achieved by optimizing a signed reward error once imitation had been achieved. Experimental results obtained with the humanoid robot HOAP-3 are presented.
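To illustrate the reward-profile matching step described in the abstract, the sketch below uses SciPy's differential evolution to search for trajectory parameters whose resulting reward profile stays close to a human-derived reference profile. This is a minimal sketch under assumed simplifications: the cubic trajectory parameterization, the ZMP and effort proxies, the reward weights, and the fixed reference profile are all illustrative placeholders, not the paper's actual model or data.

```python
# Hypothetical sketch: fit trajectory parameters so a toy reward profile
# (ZMP-deviation proxy + joint-torque effort proxy) tracks a reference
# profile assumed to be derived from human demonstrations.
import numpy as np
from scipy.optimize import differential_evolution

T = 50                    # time steps in the sit-to-stand motion (assumed)
N_JOINTS = 3              # three-link chain: ankle, knee, hip
REF_PROFILE = np.linspace(0.0, 1.0, T)   # placeholder human-derived reward profile

def joint_trajectories(params):
    """Cubic polynomial trajectory per joint, 4 coefficients each (assumed form)."""
    t = np.linspace(0.0, 1.0, T)
    coeffs = params.reshape(N_JOINTS, 4)
    return np.stack([np.polyval(c, t) for c in coeffs], axis=1)   # shape (T, N_JOINTS)

def reward_profile(q):
    """Toy reward: penalize a ZMP-offset proxy and squared joint accelerations."""
    zmp_proxy = np.abs(q @ np.array([0.5, 0.3, 0.2]))   # stand-in for ZMP deviation
    qdd = np.gradient(np.gradient(q, axis=0), axis=0)   # finite-difference accelerations
    effort = np.sum(qdd ** 2, axis=1)
    return -(1.0 * zmp_proxy + 0.1 * effort)             # higher reward is better

def imitation_cost(params):
    """Squared mismatch between the robot's reward profile and the reference."""
    r = reward_profile(joint_trajectories(params))
    r = (r - r.min()) / (np.ptp(r) + 1e-9)               # normalize for comparison
    return np.sum((r - REF_PROFILE) ** 2)

bounds = [(-1.0, 1.0)] * (N_JOINTS * 4)
result = differential_evolution(imitation_cost, bounds, maxiter=200, seed=0)
print("best cost:", result.fun)
```

In the paper, the reference reward profile is predicted from the Markov chain fitted to the human demonstrations, and the optimized trajectory must also satisfy the humanoid's kinematic and balance constraints; the fixed linear reference and box bounds above are used only to keep the sketch self-contained.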
Pages: 34