Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation

Cited by: 5

Authors:

Hu, Hangkai [1]

Huang, Gao [1]

Li, Xiang [1]

Song, Shiji [1]

Affiliations:

[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 03

Funding:

National Natural Science Foundation of China;

Keywords:

Task analysis; Training; Trajectory; Learning systems; Heuristic algorithms; Feature extraction; Benchmark testing; Meta-learning; reinforcement learning (RL); task adaptiveness;

DOI:

10.1109/TNNLS.2021.3105407

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep reinforcement learning is confronted with problems of sampling inefficiency and poor task migration capability. Meta-reinforcement learning (meta-RL) enables meta-learners to utilize the task-solving skills trained on similar tasks and quickly adapt to new tasks. However, meta-RL methods lack enough queries toward the relationship between task-agnostic exploitation of data and task-related knowledge introduced by latent context, limiting their effectiveness and generalization ability. In this article, we develop an algorithm for off-policy meta-RL that can provide the meta-learners with self-oriented cognition toward how they adapt to the family of tasks. In our approach, we perform dynamic task-adaptiveness distillation to describe how the meta-learners adjust the exploration strategy in the meta-training process. Our approach also enables the meta-learners to balance the influence of task-agnostic self-oriented adaption and task-related information through latent context reorganization. In our experiments, our method achieves 10%-20% higher asymptotic reward than probabilistic embeddings for actor-critic RL (PEARL).


Pages: 1454-1464

Page count: 11
