Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning

Cited by: 0

Authors:

Choi, Jongwook [1]

Sharma, Archit [2]

Lee, Honglak [1,3]

Levine, Sergey [4,5]

Gu, Shixiang Shane [4]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

[2] Stanford Univ, Stanford, CA 94305 USA

[3] LG AI Res, Seoul, South Korea

[4] Google Res, Mountain View, CA 94043 USA

[5] Univ Calif Berkeley, Berkeley, CA 94720 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning to reach goal states and learning diverse skills through mutual information (MI) maximization have been proposed as principled frameworks for self-supervised reinforcement learning, allowing agents to acquire broadly applicable multitask policies with minimal reward engineering. Starting from a simple observation that the standard goal-conditioned RL (GCRL) is encapsulated by the optimization objective of variational empowerment, we discuss how GCRL and MI-based RL can be generalized into a single family of methods, which we name variational GCRL (VGCRL), interpreting variational MI maximization, or variational empowerment, as representation learning methods that acquire functionally-aware state representations for goal reaching. This novel perspective allows us to: (1) derive simple but unexplored variants of GCRL to study how adding small representation capacity can already expand its capabilities; (2) investigate how discriminator function capacity and smoothness determine the quality of discovered skills, or latent goals, through modifying latent dimensionality and applying spectral normalization; (3) adapt techniques such as hindsight experience replay (HER) from GCRL to MI-based RL; and lastly, (4) propose a novel evaluation metric, named latent goal reaching (LGR), for comparing empowerment algorithms with different choices of latent dimensionality and discriminator parameterization. Through principled mathematical derivations and careful experimental studies, our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.

Pages: 11
