Shaping in Reinforcement Learning via Knowledge Transferred from Human-Demonstrations

Cited by: 0

Authors:

Wang Guofang [1]

Fang Zhou [1]

Li Ping [1]

Li Bo [1]

Affiliations:

[1] Zhejiang Univ, Sch Aeronaut & Astronaut, Hangzhou 310027, Zhejiang, Peoples R China

Source:

2015 34TH CHINESE CONTROL CONFERENCE (CCC) | 2015

Keywords:

Reinforcement learning; Human-demonstrations; transfer;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Transfer has been widely used to ameliorate the slow convergence of reinforcement learning (RL) by reusing previously obtained knowledge from related but distinct tasks. In this paper, we propose a framework that transfers knowledge learned directly from human-demonstration trajectories of source tasks to shape the RL algorithm in the target task. This avoids the time-consuming RL training process in the source tasks and thereby expands the learning paradigm of transfer in RL domains. In our framework, rather than transferring a value function or policy, which are the most common choices, we adopt the visit frequencies of states in successful demonstration trajectories as the acquired knowledge and then perform transfer via a shared agent space. Simulation experiments on obstacle avoidance problems suggest that the transferred knowledge can noticeably accelerate learning in the target task. As a case study, the experiments show the potential of our framework for knowledge transfer in RL tasks.
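The idea in the abstract, using state visit frequencies from successful demonstrations to shape learning, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the frequencies enter the reward, so the sketch assumes potential-based shaping with the normalized visit frequency as the potential. The state encoding, `gamma`, and `scale` are likewise hypothetical.

```python
from collections import Counter


def visit_frequencies(demos):
    """Count how often each agent-space state appears across successful
    demonstrations, normalized so the most visited state has value 1.0."""
    counts = Counter(state for traj in demos for state in traj)
    top = max(counts.values(), default=1)
    return {state: n / top for state, n in counts.items()}


def shaped_reward(reward, state, next_state, potential, gamma=0.95, scale=1.0):
    """ASSUMPTION: potential-based shaping, r + scale * (gamma * phi(s') - phi(s)).
    States never seen in the demonstrations get potential 0."""
    phi = potential.get(state, 0.0)
    phi_next = potential.get(next_state, 0.0)
    return reward + scale * (gamma * phi_next - phi)


# Hypothetical agent-space states: (obstacle status, direction to goal).
demos = [
    [("clear", "ahead"), ("blocked", "ahead"), ("clear", "left"), ("clear", "ahead")],
    [("clear", "ahead"), ("clear", "ahead"), ("blocked", "ahead"), ("clear", "left")],
]
potential = visit_frequencies(demos)

# Moving from a rarely visited state to a frequently visited one earns a bonus.
bonus = shaped_reward(0.0, ("blocked", "ahead"), ("clear", "ahead"), potential)
```

Because the potential is keyed on agent-space features rather than task-specific coordinates, the same table could in principle be reused in a target task that shares that agent space, which is the transfer route the abstract describes.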


Pages: 3033-3038

Page count: 6

