Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

Cited by: 58
Authors
Dilokthanakul, Nat [1 ,2 ]
Kaplanis, Christos [1 ]
Pawlowski, Nick [1 ]
Shanahan, Murray [1 ,3 ]
Affiliations
[1] Imperial Coll London, Comp Dept, London SW7 2AZ, England
[2] Vidyasirimedhi Inst Sci & Technol, Rayong 21210, Thailand
[3] DeepMind, London N1C 4AG, England
Keywords
Task analysis; Reinforcement learning; Training; Neural networks; Visualization; Trajectory; Learning systems; Auxiliary task; Deep reinforcement learning (DRL); Hierarchical reinforcement learning (HRL); Intrinsic motivation; Exploration
DOI
10.1109/TNNLS.2019.2891792
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
One of the main concerns of deep reinforcement learning (DRL) is data inefficiency, which stems both from an inability to fully utilize the data acquired and from naive exploration strategies. To alleviate these problems, we propose a DRL algorithm that aims to improve data efficiency both by utilizing unrewarded experiences and by improving the exploration strategy, combining ideas from unsupervised auxiliary tasks, intrinsic motivation, and hierarchical reinforcement learning (HRL). Our method is based on a simple HRL architecture with a metacontroller and a subcontroller. The subcontroller is intrinsically motivated by the metacontroller to learn to control aspects of the environment, with the intention of giving the agent: 1) a neural representation that is generically useful for tasks that involve manipulation of the environment and 2) the ability to explore the environment in a temporally extended manner through the control of the metacontroller. In this way, we reinterpret the notion of pixel- and feature-control auxiliary tasks as reusable skills that can be learned via an intrinsic reward. We evaluate our method on a number of Atari 2600 games. We found that it outperforms the baseline in several environments and significantly improves performance in one of the hardest games, Montezuma's Revenge, for which the ability to utilize sparse data is key. We found that the inclusion of the intrinsic reward is crucial to the improvement in performance and that most of the benefit seems to derive from the representations learned during training.
Pages: 3409-3418
Page count: 10
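
The abstract only sketches the method at a high level. As a concrete illustration, here is a minimal Python sketch of a feature-control intrinsic reward inside a two-level metacontroller/subcontroller loop. Everything below is an assumption made for illustration, not the paper's implementation: env is taken to follow the classic Gym reset()/step() interface, encoder is a learned observation-to-feature map, meta_policy picks a feature index as a temporally extended goal, and sub_policy acts to change that feature.

def feature_control_reward(phi_prev, phi_next, k):
    """Intrinsic reward for controlling feature k: the magnitude of the
    change in the k-th feature between consecutive steps (assumed form;
    the paper's exact formulation may differ)."""
    return abs(phi_next[k] - phi_prev[k])

def run_episode(env, meta_policy, sub_policy, encoder, horizon=20):
    """Run one episode; returns (extrinsic_return, intrinsic_return).
    Learning updates are omitted: in the setting the abstract describes,
    the subcontroller would be trained on the intrinsic reward and the
    metacontroller on the extrinsic reward."""
    obs = env.reset()
    phi = encoder(obs)
    done = False
    extrinsic_return = 0.0
    intrinsic_return = 0.0
    while not done:
        # The metacontroller commits to one feature to control for up to
        # `horizon` steps, which is what makes exploration temporally extended.
        goal = meta_policy(phi)
        for _ in range(horizon):
            action = sub_policy(phi, goal)
            obs, ext_reward, done, _ = env.step(action)
            phi_next = encoder(obs)
            intrinsic_return += feature_control_reward(phi, phi_next, goal)
            extrinsic_return += ext_reward
            phi = phi_next
            if done:
                break
    return extrinsic_return, intrinsic_return

Because the intrinsic reward is computed from the agent's own representation rather than from the environment's score, the subcontroller keeps receiving learning signal even when extrinsic rewards are sparse, which is consistent with the abstract's observation about Montezuma's Revenge.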