ELSIM: End-to-End Learning of Reusable Skills Through Intrinsic Motivation

Cited by: 0
Authors
Aubret, Arthur [1]
Matignon, Laetitia [1]
Hassas, Salima [1]
Affiliations
[1] Univ Lyon 1, LIRIS, CNRS, Univ Lyon, F-69622 Villeurbanne, France
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT II | 2021 / Vol. 12458
Keywords
Intrinsic motivation; Curriculum learning; Developmental learning; Reinforcement learning; Exploration
DOI
10.1007/978-3-030-67661-2_32
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Taking inspiration from developmental learning, we present a novel reinforcement learning architecture that hierarchically learns and represents self-generated skills in an end-to-end way. With this architecture, an agent focuses only on task-rewarded skills while keeping the skill-learning process bottom-up. This bottom-up approach allows the agent to learn skills that (1) are transferable across tasks and (2) improve exploration when rewards are sparse. To do so, we combine a previously defined mutual information objective with a novel curriculum learning algorithm, creating an unlimited and explorable tree of skills. We test our agent on simple gridworld environments to understand and visualize how the agent distinguishes between its skills. We then show that our approach scales to more difficult MuJoCo environments, in which our agent builds a representation of skills that improves both transfer learning and exploration over a baseline when rewards are sparse.
Pages: 541-556
Number of pages: 16