State Representation Learning for Goal-Conditioned Reinforcement Learning

Citations: 0
Authors
Steccanella, Lorenzo [1]
Jonsson, Anders [1]
Affiliations
[1] Univ Pompeu Fabra, Dept Informat & Commun Technol, Barcelona, Spain
Funding
EU Horizon 2020;
Keywords
Representation learning; Goal-conditioned reinforcement learning; Reward shaping; Reinforcement learning;
DOI
10.1007/978-3-031-26412-2_6
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge and learns from offline, unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that it can successfully learn representations in large and/or continuous domains.
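The quantity the abstract describes, the minimum number of actions between two states, can be illustrated on a toy problem. The sketch below is not the paper's learning method: it computes exact distances by breadth-first search on a hypothetical five-state chain (the names `chain`, `min_action_distances`, and `shaped_reward` are assumptions made for illustration), and then uses the negative distance to a goal as a potential for reward shaping. A learned embedding would approximate these distances where exact search is infeasible.

```python
from collections import deque


def min_action_distances(transitions, start):
    """Breadth-first search over a deterministic transition graph.

    `transitions` maps each state to the list of states reachable in one
    action. Returns the minimum number of actions from `start` to every
    reachable state.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, []):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


def shaped_reward(reward, dist_to_goal, s, s_next, gamma=0.99):
    """Potential-based shaping with potential phi(s) = -distance(s, goal)."""
    phi_s = -dist_to_goal[s]
    phi_next = -dist_to_goal[s_next]
    return reward + gamma * phi_next - phi_s


# Hypothetical toy MDP: a chain of 5 states with "left" and "right" actions.
# Transitions are symmetric here, so the distance from the goal to a state
# equals the distance from that state to the goal.
chain = {i: [max(i - 1, 0), min(i + 1, 4)] for i in range(5)}
goal = 4
dist = min_action_distances(chain, goal)

# Moving one step toward the goal earns a positive shaping bonus.
bonus = shaped_reward(0.0, dist, s=2, s_next=3, gamma=1.0)
```

In an asymmetric environment the search would have to run on the reversed transition graph to obtain distances to the goal rather than from it.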
Pages: 84-99
Page count: 16
Related papers
50 in total
  • [21] Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning
    Liu, Bo
    Feng, Yihao
    Liu, Qiang
    Stone, Peter
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 8799 - 8806
  • [22] Goal-Conditioned Reinforcement Learning With Disentanglement-Based Reachability Planning
    Qian, Zhifeng
    You, Mingyu
    Zhou, Hongjun
    Xu, Xuanhui
    He, Bin
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (08): : 4721 - 4728
  • [23] Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning
    Liu, Jinxin
    Wang, Donglin
    Tian, Qiangxing
    Chen, Zhengyu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7558 - 7566
  • [24] Highly valued subgoal generation for efficient goal-conditioned reinforcement learning
    Li, Yao
    Wang, YuHui
    Tan, XiaoYang
    NEURAL NETWORKS, 2025, 181
  • [25] Instructing Goal-Conditioned Reinforcement Learning Agents with Temporal Logic Objectives
    Qiu, Wenjie
    Mao, Wensen
    Zhu, He
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] Goal-Conditioned Hierarchical Reinforcement Learning With High-Level Model Approximation
    Luo, Yu
    Ji, Tianying
    Sun, Fuchun
    Liu, Huaping
    Zhang, Jianwei
    Jing, Mingxuan
    Huang, Wenbing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2705 - 2719
  • [27] Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement Learning
    Hongyu Ding
    Yuanze Tang
    Qing Wu
    Bo Wang
    Chunlin Chen
    Zhi Wang
    IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2023, 10 (12) : 2233 - 2247
  • [28] Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
    Hoang, Christopher
    Sohn, Sungryull
    Choi, Jongwook
    Carvalho, Wilka
    Lee, Honglak
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [29] Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: A Short Survey
    Colas, Cedric
    Karch, Tristan
    Sigaud, Olivier
    Oudeyer, Pierre-Yves
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 74 : 1159 - 1199
  • [30] Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression
    Ma, Yecheng Jason
    Yan, Jason
    Jayaraman, Dinesh
    Bastani, Osbert
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,