Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation

Cited by: 5
Authors
Hu, Hangkai [1]
Huang, Gao [1]
Li, Xiang [1]
Song, Shiji [1]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Training; Trajectory; Learning systems; Heuristic algorithms; Feature extraction; Benchmark testing; Meta-learning; reinforcement learning (RL); task adaptiveness;
DOI
10.1109/TNNLS.2021.3105407
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning is confronted with problems of sampling inefficiency and poor task migration capability. Meta-reinforcement learning (meta-RL) enables meta-learners to utilize the task-solving skills trained on similar tasks and quickly adapt to new tasks. However, meta-RL methods lack enough queries toward the relationship between task-agnostic exploitation of data and task-related knowledge introduced by latent context, limiting their effectiveness and generalization ability. In this article, we develop an algorithm for off-policy meta-RL that can provide the meta-learners with self-oriented cognition toward how they adapt to the family of tasks. In our approach, we perform dynamic task-adaptiveness distillation to describe how the meta-learners adjust the exploration strategy in the meta-training process. Our approach also enables the meta-learners to balance the influence of task-agnostic self-oriented adaption and task-related information through latent context reorganization. In our experiments, our method achieves 10%-20% higher asymptotic reward than probabilistic embeddings for actor-critic RL (PEARL).
Pages: 1454-1464
Page count: 11