ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing

Cited by: 0
Authors
Akbulut, M. Tuluhan [1 ]
Oztop, Erhan [2 ]
Seker, M. Yunus [1 ]
Xue, Honghu [3 ]
Tekden, Ahmet E. [1 ]
Ugur, Emre [1 ]
Affiliations
[1] Bogazici Univ, Istanbul, Turkey
[2] Ozyegin Univ, Istanbul, Turkey
[3] Univ Lubeck, Lubeck, Germany
Source
CONFERENCE ON ROBOT LEARNING, VOL 155, 2020
Keywords
Learning from Demonstration; Reinforcement Learning; Deep Learning; Representation Learning;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
To equip robots with dexterous skills, an effective approach is to first transfer the desired skill via Learning from Demonstration (LfD), then let the robot improve it by self-exploration via Reinforcement Learning (RL). In this paper, we propose a novel LfD+RL framework, namely Adaptive Conditional Neural Movement Primitives (ACNMP), that allows efficient policy improvement in novel environments and effective skill transfer between different agents. This is achieved by exploiting the latent representation learned by the underlying Conditional Neural Process (CNP) model, and by training the model simultaneously with supervised learning (SL) to acquire the demonstrated trajectories and with RL to discover new trajectories. Through simulation experiments, we show that (i) ACNMP enables the system to extrapolate to situations where pure LfD fails; (ii) simultaneous training through SL and RL preserves the shape of the demonstrations while adapting to novel situations, owing to the representations shared by both learners; (iii) ACNMP enables order-of-magnitude more sample-efficient RL in extrapolation of reaching tasks compared to existing approaches; (iv) ACNMP can be used to transfer skills between robots with different morphologies, with competitive learning speed and, importantly, with fewer assumptions than state-of-the-art approaches. Finally, we show the real-world suitability of ACNMP through real-robot experiments involving obstacle avoidance, pick-and-place, and pouring actions.
Pages: 1896-1907
Number of pages: 12
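
Illustrative code sketch
The abstract above describes a shared-representation scheme in which a Conditional Neural Process is trained jointly: with supervised learning on demonstrated trajectories and with a reinforcement learning objective on newly discovered trajectories. The following is a minimal PyTorch sketch of that idea only; the network sizes, the REINFORCE-style RL term, and the reward function are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class CNP(nn.Module):
    def __init__(self, task_dim=1, traj_dim=2, latent_dim=128):
        super().__init__()
        # Encoder maps (time, trajectory point, task parameter) triples to a latent code.
        self.encoder = nn.Sequential(
            nn.Linear(1 + traj_dim + task_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim))
        # Decoder maps (shared latent, query time) to a predicted mean and log-variance.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 1, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, 2 * traj_dim))

    def forward(self, context, target_t):
        r = self.encoder(context).mean(dim=0, keepdim=True)   # shared latent representation
        r = r.expand(target_t.shape[0], -1)
        mean, log_var = self.decoder(torch.cat([r, target_t], dim=-1)).chunk(2, dim=-1)
        return mean, log_var

def supervised_loss(model, context, target_t, demo_y):
    # LfD part: negative log-likelihood of the demonstrated trajectory points.
    mean, log_var = model(context, target_t)
    return (0.5 * (log_var + (demo_y - mean) ** 2 / log_var.exp())).mean()

def rl_loss(model, context, target_t, reward_fn):
    # RL part: REINFORCE-style objective on trajectories sampled from the same
    # decoder, so policy improvement updates the shared representation.
    mean, log_var = model(context, target_t)
    dist = torch.distributions.Normal(mean, (0.5 * log_var).exp())
    sample = dist.sample()
    return -dist.log_prob(sample).sum() * reward_fn(sample)

model = CNP()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
context = torch.randn(5, 4)                      # dummy observed (t, y, task) context points
t = torch.linspace(0, 1, 20).unsqueeze(-1)       # query timestamps
demo_y = torch.randn(20, 2)                      # dummy demonstrated trajectory
reward_fn = lambda traj: -((traj[-1] - torch.tensor([1.0, 0.0])) ** 2).sum()  # assumed task reward
loss = supervised_loss(model, context, t, demo_y) + rl_loss(model, context, t, reward_fn)
opt.zero_grad(); loss.backward(); opt.step()     # both objectives update the same shared weights

Because both losses backpropagate into the same encoder and decoder, the RL exploration reshapes the representation without discarding the demonstrated trajectory shapes, which is the mechanism the abstract attributes to representation sharing.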