Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning

Cited by: 0
Authors
Li, Yunfei [1]
Gao, Tian [1]
Yang, Jiaqi [2]
Xu, Huazhe [3]
Wu, Yi [1, 4]
Affiliations
[1] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA USA
[3] Stanford Univ, Stanford, CA USA
[4] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
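Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract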
It has been a recent trend to leverage the power of supervised learning (SL) towards more effective reinforcement learning (RL) methods. We propose a novel phasic approach that alternates online RL and offline SL to tackle sparse-reward goal-conditioned problems. In the online phase, we perform RL training and collect rollout data; in the offline phase, we perform SL on the successful trajectories in the dataset. To further improve sample efficiency, we adopt additional techniques in the online phase, including task reduction to generate more feasible trajectories and a value-difference-based intrinsic reward to alleviate the sparse-reward issue. We call the overall algorithm PhAsic self-Imitative Reduction (PAIR). PAIR substantially outperforms both non-phasic RL and phasic SL baselines on sparse-reward goal-conditioned robotic control problems, including a challenging stacking task. PAIR is the first RL method that learns to stack 6 cubes with only 0/1 success rewards from scratch.
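The alternation described in the abstract can be illustrated with a minimal Python sketch. It shows only the phasic control flow: an online phase that returns rollouts, a filter that keeps the successful ones, and an offline phase that consumes them. The `Trajectory` container, the function names, and the stub phases are all hypothetical and are not taken from the paper; task reduction and the value-difference-based intrinsic reward are omitted because the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Trajectory:
    """One goal-conditioned rollout (hypothetical container, not from the paper)."""
    goal: int
    actions: List[int]
    success: bool  # sparse 0/1 outcome: did the rollout reach its goal?


def select_successful(dataset: Sequence[Trajectory]) -> List[Trajectory]:
    """Keep only the rollouts that reached their goal, for use as SL targets."""
    return [traj for traj in dataset if traj.success]


def phasic_training(
    online_rl_phase: Callable[[], List[Trajectory]],
    offline_sl_phase: Callable[[List[Trajectory]], None],
    num_iterations: int,
) -> int:
    """Alternate an online RL phase with an offline SL phase.

    Returns the total number of successful trajectories passed to SL.
    """
    total_demos = 0
    for _ in range(num_iterations):
        dataset = online_rl_phase()         # assumed: RL updates + rollout collection
        demos = select_successful(dataset)  # self-generated demonstrations
        offline_sl_phase(demos)             # assumed: supervised (imitation) updates
        total_demos += len(demos)
    return total_demos


def _stub_online_phase() -> List[Trajectory]:
    """Stand-in for real RL training: one success and one failure per call."""
    return [
        Trajectory(goal=1, actions=[0, 1], success=True),
        Trajectory(goal=2, actions=[1], success=False),
    ]


seen: List[Trajectory] = []  # stand-in for the SL learner: just records its inputs
used = phasic_training(_stub_online_phase, seen.extend, num_iterations=3)
print(used)  # 3
```

Passing the two phases as callables keeps the loop independent of any particular RL or SL implementation, which is a choice made for this sketch rather than something the abstract prescribes.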
Pages: 17
Related papers
50 in total
  • [21] Representation-Based Robustness in Goal-Conditioned Reinforcement Learning
    Yin, Xiangyu
    Wu, Sihao
    Liu, Jiaxu
    Fang, Meng
    Zhao, Xingyu
    Huang, Xiaowei
    Ruan, Wenjie
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 19, 2024, : 21761 - 21769
  • [22] Compact Goal Representation Learning via Information Bottleneck in Goal-Conditioned Reinforcement Learning
    Zou, Qiming
    Suzuki, Einoshin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2368 - 2381
  • [23] Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning
    Ding, Wenhao
    Lin, Haohong
    Li, Bo
    Zhao, Ding
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [24] Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning
    Li, Jinning
    Tang, Chen
    Tomizuka, Masayoshi
    Zhan, Wei
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (04) : 10216 - 10223
  • [25] Learning Efficient Representations for Goal-conditioned Reinforcement Learning via Tabu Search
    Liang, Tianhao
    Chen, Tianyang
    Chen, Xianwei
    Ren, Qinyuan
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, CIS AND IEEE INTERNATIONAL CONFERENCE ON ROBOTICS, AUTOMATION AND MECHATRONICS, RAM, CIS-RAM 2024, 2024, : 328 - 333
  • [26] Adaptive multi-model fusion learning for sparse-reward reinforcement learning
    Park, Giseung
    Jung, Whiyoung
    Han, Seungyul
    Choi, Sungho
    Sung, Youngchul
    NEUROCOMPUTING, 2025, 633
  • [27] Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning
    Liu, Bo
    Feng, Yihao
    Liu, Qiang
    Stone, Peter
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 8799 - 8806
  • [28] Goal-Conditioned Reinforcement Learning With Disentanglement-Based Reachability Planning
    Qian, Zhifeng
    You, Mingyu
    Zhou, Hongjun
    Xu, Xuanhui
    He, Bin
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (08) : 4721 - 4728
  • [29] Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning
    Liu, Jinxin
    Wang, Donglin
    Tian, Qiangxing
    Chen, Zhengyu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7558 - 7566
  • [30] Goal-conditioned offline reinforcement learning through state space partitioning
    Wang, Mianchu
    Jin, Yue
    Montana, Giovanni
    MACHINE LEARNING, 2024, 113 (05) : 2435 - 2465