Self-supervised Reinforcement Learning with Independently Controllable Subgoals

Cited by: 0
Authors
Zadaianchuk, Andrii [1 ,2 ]
Martius, Georg [1 ]
Yang, Fanny [2 ]
Affiliations
[1] Max Planck Institute for Intelligent Systems, Tübingen, Germany
[2] Swiss Federal Institute of Technology (ETH Zurich), Department of Computer Science, Zurich, Switzerland
Source
Conference on Robot Learning (CoRL), Vol. 164, 2021
Keywords
object-centric representations; relations; self-supervised reinforcement learning;
DOI
Not available
Chinese Library Classification
TP [Automation and Computer Technology]
Subject Classification Code
0812
Abstract
To successfully tackle challenging manipulation tasks, autonomous agents must learn a diverse set of skills and how to combine them. Recently, self-supervised agents that set their own abstract goals by exploiting the discovered structure in the environment were shown to perform well on many different tasks. In particular, some of them were applied to learn basic manipulation skills in compositional multi-object environments. However, these methods learn skills without taking the dependencies between objects into account. Thus, the learned skills are difficult to combine in realistic environments. We propose a novel self-supervised agent that estimates relations between environment components and uses them to independently control different parts of the environment state. In addition, the estimated relations between objects can be used to decompose a complex goal into a compatible sequence of subgoals. We show that, by using this framework, an agent can efficiently and automatically learn manipulation tasks in multi-object environments with different relations between objects.
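The abstract's key idea of decomposing a complex goal into a compatible sequence of subgoals, once relations between objects have been estimated, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the estimated relations are already available as a dependency map (object → prerequisite objects), and all names here (`order_subgoals`, `deps`, `goals`) are hypothetical. Ordering then reduces to a topological sort:

```python
from graphlib import TopologicalSorter

def order_subgoals(dependencies, subgoals):
    """Order per-object subgoals so that every prerequisite object is
    handled before the objects that depend on it.

    `dependencies` maps each object to the set of objects it depends on;
    `subgoals` maps each object to its subgoal description.
    """
    ts = TopologicalSorter(dependencies)
    order = list(ts.static_order())  # prerequisites come first
    return [(obj, subgoals[obj]) for obj in order if obj in subgoals]

# Toy example: a cube resting on a tray must be moved before the tray.
deps = {"tray": {"cube"}, "cube": set()}
goals = {"cube": "move cube aside", "tray": "place tray at target"}
print(order_subgoals(deps, goals))
# [('cube', 'move cube aside'), ('tray', 'place tray at target')]
```

With a valid dependency map, executing the subgoals in the returned order guarantees no later subgoal is blocked by an unmoved prerequisite object.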
Pages
384-394 (11 pages)