Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning

Cited: 0
Authors
Choi, Jongwook [1 ]
Sharma, Archit [2 ]
Lee, Honglak [1 ,3 ]
Levine, Sergey [4 ,5 ]
Gu, Shixiang Shane [4 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Stanford Univ, Stanford, CA 94305 USA
[3] LG AI Res, Seoul, South Korea
[4] Google Res, Mountain View, CA 94043 USA
[5] Univ Calif Berkeley, Berkeley, CA 94720 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning to reach goal states and learning diverse skills through mutual information (MI) maximization have been proposed as principled frameworks for self-supervised reinforcement learning, allowing agents to acquire broadly applicable multitask policies with minimal reward engineering. Starting from a simple observation that the standard goal-conditioned RL (GCRL) is encapsulated by the optimization objective of variational empowerment, we discuss how GCRL and MI-based RL can be generalized into a single family of methods, which we name variational GCRL (VGCRL), interpreting variational MI maximization, or variational empowerment, as representation learning methods that acquire functionally-aware state representations for goal reaching. This novel perspective allows us to: (1) derive simple but unexplored variants of GCRL to study how adding small representation capacity can already expand its capabilities; (2) investigate how discriminator function capacity and smoothness determine the quality of discovered skills, or latent goals, through modifying latent dimensionality and applying spectral normalization; (3) adapt techniques such as hindsight experience replay (HER) from GCRL to MI-based RL; and lastly, (4) propose a novel evaluation metric, named latent goal reaching (LGR), for comparing empowerment algorithms with different choices of latent dimensionality and discriminator parameterization. Through principled mathematical derivations and careful experimental studies, our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
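As context for the claim that standard GCRL is encapsulated by the variational empowerment objective, the following is a minimal sketch under standard definitions; the notation ($z$ for a latent goal or skill, $s$ for a state, $\pi$ for the policy, $q_\phi$ for the variational discriminator) is introduced here for illustration and may differ from the paper's own. Variational empowerment maximizes the Barber-Agakov lower bound on the mutual information between latent goals and visited states,
\[
I(S; Z) \;\geq\; \mathbb{E}_{z \sim p(z),\; s \sim \pi(\cdot \mid z)}\big[\log q_\phi(z \mid s)\big] + \mathcal{H}(Z),
\]
which yields the intrinsic reward $r(s, z) = \log q_\phi(z \mid s)$. If $q_\phi(z \mid s)$ is held fixed at a hard-coded goal-reaching likelihood, e.g. $q(z \mid s) \propto \exp\!\big(-\lVert s - z \rVert_2^2\big)$ with goals living directly in state space, this reward reduces to the negative squared distance to the goal used in standard goal-conditioned RL; letting $q_\phi$ be a learned, possibly lower-dimensional discriminator instead recovers MI-based skill discovery, which is the sense in which the two families form a single spectrum.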
Pages: 11
Related Papers (50 in total)
  • [41] Representation Learning on Graphs: A Reinforcement Learning Application
    Madjiheurem, Sephora
    Toni, Laura
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [42] Goal Misgeneralization in Deep Reinforcement Learning
    Langosco, Lauro
    Koch, Jack
    Sharkey, Lee
    Pfau, Jacob
    Krueger, David
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [43] The Guiding Role of Reward Based on Phased Goal in Reinforcement Learning
    Liu, Yiming
    Hu, Zheng
    ICMLC 2020: 2020 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2020: 535 - 541
  • [44] Goal-based scenarios
    Richards, B
    COMMUNICATIONS OF THE ACM, 1998, 41 (10) : 30 - 31
  • [45] Goal-based Caustics
    Papas, Marios
    Jarosz, Wojciech
    Jakob, Wenzel
    Rusinkiewicz, Szymon
    Matusik, Wojciech
    Weyrich, Tim
    COMPUTER GRAPHICS FORUM, 2011, 30 (02) : 503 - 511
  • [46] The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms
    Koenig, S
    Simmons, RG
    MACHINE LEARNING, 1996, 22 (1-3) : 227 - 250
  • [47] Variational Regret Bounds for Reinforcement Learning
    Ortner, Ronald
    Gajane, Pratik
    Auer, Peter
    35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 81 - 90
  • [48] Deep Variational Reinforcement Learning for POMDPs
    Igl, Maximilian
    Zintgraf, Luisa
    Le, Tuan Anh
    Wood, Frank
    Whiteson, Shimon
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [49] RETRACTED: E-Learning Recommender Systems Based on Goal-Based Hybrid Filtering (Retracted Article)
    Chughtai, Muhammad Waseem
    Selamat, Ali
    Ghani, Imran
    Jung, Jason J.
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2014
  • [50] An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare
    Killian, Taylor W.
    Zhang, Haoran
    Subramanian, Jayakumar
    Fatemi, Mehdi
    Ghassemi, Marzyeh
    MACHINE LEARNING FOR HEALTH, VOL 136, 2020, 136 : 139 - +