ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing

Cited by: 0
Authors
Akbulut, M. Tuluhan [1 ]
Oztop, Erhan [2 ]
Seker, M. Yunus [1 ]
Xue, Honghu [3 ]
Tekden, Ahmet E. [1 ]
Ugur, Emre [1 ]
Affiliations
[1] Bogazici Univ, Istanbul, Turkey
[2] Ozyegin Univ, Istanbul, Turkey
[3] Univ Lubeck, Lubeck, Germany
Source
CONFERENCE ON ROBOT LEARNING, 2020, Vol. 155
Keywords
Learning from Demonstration; Reinforcement Learning; Deep Learning; Representation Learning
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
To equip robots with dexterous skills, an effective approach is to first transfer the desired skill via Learning from Demonstration (LfD), then let the robot improve it by self-exploration via Reinforcement Learning (RL). In this paper, we propose a novel LfD+RL framework, namely Adaptive Conditional Neural Movement Primitives (ACNMP), that allows efficient policy improvement in novel environments and effective skill transfer between different agents. This is achieved by exploiting the latent representation learned by the underlying Conditional Neural Process (CNP) model, and by training the model simultaneously with supervised learning (SL), to acquire the demonstrated trajectories, and with RL, to discover new trajectories. Through simulation experiments, we show that (i) ACNMP enables the system to extrapolate to situations where pure LfD fails; (ii) simultaneous training through SL and RL preserves the shape of the demonstrations while adapting to novel situations, owing to the representations shared by both learners; (iii) ACNMP enables order-of-magnitude more sample-efficient RL in extrapolated reaching tasks compared to existing approaches; (iv) ACNMPs can be used to implement skill transfer between robots of different morphology, with competitive learning speeds and, importantly, with fewer assumptions than state-of-the-art approaches. Finally, we show the real-world suitability of ACNMPs through real-robot experiments involving obstacle avoidance, pick-and-place, and pouring actions.
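The CNP model at the core of ACNMP works by encoding each observed context point into a latent vector, averaging those latents into a single permutation-invariant representation, and decoding that representation together with a query input into a prediction. The sketch below illustrates only this generic CNP forward pass; the layer sizes, weight values, and function names are hypothetical stand-ins, not the paper's architecture or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

D_LAT = 8  # latent representation size (illustrative choice)

# Random weights standing in for trained encoder/decoder networks.
W_enc = rng.standard_normal((2, D_LAT))      # encodes an (x, y) pair -> latent
W_dec = rng.standard_normal((D_LAT + 1, 1))  # decodes (r, x_query) -> y_hat


def relu(z):
    return np.maximum(z, 0.0)


def cnp_forward(context_xy, target_x):
    """Predict y at each target_x given observed (x, y) context points."""
    latents = relu(context_xy @ W_enc)       # one latent per context pair
    r = latents.mean(axis=0)                 # permutation-invariant aggregate
    queries = np.concatenate(
        [np.tile(r, (len(target_x), 1)), target_x[:, None]], axis=1
    )
    return queries @ W_dec                   # one prediction per query point


context = np.array([[0.0, 0.1], [0.5, 0.4], [1.0, 0.9]])  # (x, y) pairs
preds = cnp_forward(context, np.linspace(0.0, 1.0, 5))
print(preds.shape)  # (5, 1)
```

Because the aggregate representation r is what both the SL and RL objectives act on in an ACNMP-style setup, improvements discovered by RL and the demonstrated trajectories learned by SL share the same latent space, which is what the abstract credits for preserving demonstration shape while adapting.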
Pages: 1896-1907 (12 pages)