ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing

Cited by: 0
Authors
Akbulut, M. Tuluhan [1 ]
Oztop, Erhan [2 ]
Seker, M. Yunus [1 ]
Xue, Honghu [3 ]
Tekden, Ahmet E. [1 ]
Ugur, Emre [1 ]
Affiliations
[1] Bogazici Univ, Istanbul, Turkey
[2] Ozyegin Univ, Istanbul, Turkey
[3] Univ Lubeck, Lubeck, Germany
Source
CONFERENCE ON ROBOT LEARNING, VOL 155, 2020
Keywords
Learning from Demonstration; Reinforcement Learning; Deep Learning; Representation Learning;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
To equip robots with dexterous skills, an effective approach is to first transfer the desired skill via Learning from Demonstration (LfD), then let the robot improve it by self-exploration via Reinforcement Learning (RL). In this paper, we propose a novel LfD+RL framework, namely Adaptive Conditional Neural Movement Primitives (ACNMP), that allows efficient policy improvement in novel environments and effective skill transfer between different agents. This is achieved by exploiting the latent representation learned by the underlying Conditional Neural Process (CNP) model, and by training the model simultaneously with supervised learning (SL) to acquire the demonstrated trajectories and with RL to discover new trajectories. Through simulation experiments, we show that (i) ACNMP enables the system to extrapolate to situations where pure LfD fails; (ii) simultaneous training through SL and RL preserves the shape of the demonstrations while adapting to novel situations, owing to the representations shared by both learners; (iii) ACNMP enables order-of-magnitude more sample-efficient RL in extrapolation of reaching tasks compared to existing approaches; (iv) ACNMP can be used to transfer skills between robots with different morphologies, with competitive learning speed and, importantly, with fewer assumptions than state-of-the-art approaches. Finally, we show the real-world suitability of ACNMP through real-robot experiments involving obstacle avoidance, pick-and-place, and pouring actions.
Pages: 1896-1907
Number of pages: 12
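
Illustrative code sketch
The abstract above describes a shared-representation scheme in which a Conditional Neural Process is trained jointly: with supervised learning on demonstrated trajectories and with a reinforcement learning objective on newly discovered trajectories. The following is a minimal PyTorch sketch of that idea only; the network sizes, the REINFORCE-style RL term, and the reward function are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class CNP(nn.Module):
    def __init__(self, task_dim=1, traj_dim=2, latent_dim=128):
        super().__init__()
        # Encoder maps (time, trajectory point, task parameter) triples to a latent code.
        self.encoder = nn.Sequential(
            nn.Linear(1 + traj_dim + task_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim))
        # Decoder maps (shared latent, query time) to a predicted mean and log-variance.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 1, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, 2 * traj_dim))

    def forward(self, context, target_t):
        r = self.encoder(context).mean(dim=0, keepdim=True)   # shared latent representation
        r = r.expand(target_t.shape[0], -1)
        mean, log_var = self.decoder(torch.cat([r, target_t], dim=-1)).chunk(2, dim=-1)
        return mean, log_var

def supervised_loss(model, context, target_t, demo_y):
    # LfD part: negative log-likelihood of the demonstrated trajectory points.
    mean, log_var = model(context, target_t)
    return (0.5 * (log_var + (demo_y - mean) ** 2 / log_var.exp())).mean()

def rl_loss(model, context, target_t, reward_fn):
    # RL part: REINFORCE-style objective on trajectories sampled from the same
    # decoder, so policy improvement updates the shared representation.
    mean, log_var = model(context, target_t)
    dist = torch.distributions.Normal(mean, (0.5 * log_var).exp())
    sample = dist.sample()
    return -dist.log_prob(sample).sum() * reward_fn(sample)

model = CNP()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
context = torch.randn(5, 4)                      # dummy observed (t, y, task) context points
t = torch.linspace(0, 1, 20).unsqueeze(-1)       # query timestamps
demo_y = torch.randn(20, 2)                      # dummy demonstrated trajectory
reward_fn = lambda traj: -((traj[-1] - torch.tensor([1.0, 0.0])) ** 2).sum()  # assumed task reward
loss = supervised_loss(model, context, t, demo_y) + rl_loss(model, context, t, reward_fn)
opt.zero_grad(); loss.backward(); opt.step()     # both objectives update the same shared weights

Because both losses backpropagate into the same encoder and decoder, the RL exploration reshapes the representation without discarding the demonstrated trajectory shapes, which is the mechanism the abstract attributes to representation sharing.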