Fast Task Adaptation Based on the Combination of Model-Based and Gradient-Based Meta Learning

Cited by: 12
Authors
Xu, Zhixiong [1 ]
Chen, Xiliang [1 ]
Cao, Lei [1 ]
Affiliations
[1] Army Engn Univ, Inst Command & Control Engn, Nanjing 210000, Peoples R China
Keywords
Task analysis; Adaptation models; Reinforcement learning; Trajectory; Games; Data models; Training; Fast adaptation; gradient; metalearning; model-based; reinforcement learning;
DOI
10.1109/TCYB.2020.3028378
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) has recently attained remarkable results in various domains, including games, robotics, and recommender systems. Nevertheless, an urgent problem in the practical application of DRL is fast adaptation. To this end, this article proposes a new and versatile metalearning approach called fast task adaptation via metalearning (FTAML), which leverages the strengths of model-based methods and gradient-based metalearning methods to train the initial parameters of the model, such that the model can efficiently master unseen tasks with a small amount of data from those tasks. The proposed algorithm makes it possible to separate task identification from task optimization: the model-based learner identifies the pattern of a task, while the gradient-based metalearner consistently improves performance with only a few gradient update steps by making use of the task embedding produced by the model-based learner. In addition, the choice of network for the model-based learner in the proposed method is also discussed, and the performance of networks with different depths is explored. Finally, the simulation results on reinforcement learning problems demonstrate that the proposed approach outperforms the compared metalearning algorithms and delivers new state-of-the-art performance on a variety of challenging control tasks.
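The record does not include the authors' code; as a rough, illustrative sketch of the two-stage idea the abstract describes (a model-based learner that summarizes support data into a task embedding, plus a few inner gradient steps from a shared initialization on the support loss), here is a minimal NumPy toy on linear regression. All names (`task_embedding`, `adapt`, `support_loss`), the linear predictor, and the mean-feature embedding are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_embedding(support_x, support_y):
    # Model-based learner stand-in: summarize the support data into a
    # fixed-size task embedding (here, simple feature/target means).
    return np.concatenate([support_x.mean(axis=0), [support_y.mean()]])

def predict(theta, x, z):
    # Predictor conditioned on both the input and the task embedding.
    return theta @ np.concatenate([x, z])

def adapt(theta, support_x, support_y, z, lr=0.05, steps=3):
    # Gradient-based metalearner stand-in: a few inner gradient steps
    # on the support loss, starting from the shared initialization theta.
    for _ in range(steps):
        grad = np.zeros_like(theta)
        for x, y in zip(support_x, support_y):
            err = predict(theta, x, z) - y
            grad += 2.0 * err * np.concatenate([x, z])
        theta = theta - lr * grad / len(support_x)
    return theta

def support_loss(theta, xs, ys, z):
    return float(np.mean([(predict(theta, x, z) - y) ** 2
                          for x, y in zip(xs, ys)]))

# Toy task: y = w·x with unknown w; eight support points.
w_true = rng.normal(size=3)
support_x = rng.normal(size=(8, 3))
support_y = support_x @ w_true

z = task_embedding(support_x, support_y)     # task identification
theta0 = np.zeros(3 + len(z))                # shared initialization
theta1 = adapt(theta0, support_x, support_y, z)  # task optimization

loss_before = support_loss(theta0, support_x, support_y, z)
loss_after = support_loss(theta1, support_x, support_y, z)
```

The point of the sketch is the separation the abstract emphasizes: `task_embedding` plays the role of the model-based learner (identifying the task), while `adapt` plays the role of the gradient-based metalearner (improving performance in a few gradient steps conditioned on that embedding).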
Pages: 5209-5218
Page count: 10
Related Papers
50 items in total
  • [41] Free Will Belief as a Consequence of Model-Based Reinforcement Learning
    Rehn, Erik M.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 353 - 363
  • [42] Extraversion differentiates between model-based and model-free strategies in a reinforcement learning task
    Skatova, Anya
    Chan, Patricia A.
    Daw, Nathaniel D.
    FRONTIERS IN HUMAN NEUROSCIENCE, 2013, 7
  • [43] Cognitive components underpinning the development of model-based learning
    Potter, Tracey C. S.
    Bryce, Nessa V.
    Hartley, Catherine A.
    DEVELOPMENTAL COGNITIVE NEUROSCIENCE, 2017, 25 : 272 - 280
  • [44] Two-step gradient-based reinforcement learning for underwater robotics behavior learning
    El-Fakdi, Andres
    Carreras, Marc
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2013, 61 (03) : 271 - 282
  • [45] Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions
    Chen, Xiong-Hui
    Luo, Fan-Ming
    Yu, Yang
    Li, Qingyang
    Qin, Zhiwei
    Shang, Wenjie
    Ye, Jieping
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15260 - 15274
  • [46] Neural Network Model-Based Reinforcement Learning Control for AUV 3-D Path Following
    Ma, Dongfang
    Chen, Xi
    Ma, Weihao
    Zheng, Huarong
    Qu, Fengzhong
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 893 - 904
  • [47] Evidence for Model-Based Action Planning in a Sequential Finger Movement Task
    Fermin, Alan
    Yoshida, Takehiko
    Ito, Makoto
    Yoshimoto, Junichiro
    Doya, Kenji
    JOURNAL OF MOTOR BEHAVIOR, 2010, 42 (06) : 371 - 379
  • [48] Gradient-based Sharpness Function
    Rudnaya, Maria
    Mattheij, Robert
    Maubach, Joseph
    ter Morsche, Hennie
    WORLD CONGRESS ON ENGINEERING, WCE 2011, VOL I, 2011, : 301 - 306
  • [49] Adaptive Model-Based Reinforcement Learning for Fast-Charging Optimization of Lithium-Ion Batteries
    Hao, Yuhan
    Lu, Qiugang
    Wang, Xizhe
    Jiang, Benben
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (01) : 127 - 137
  • [50] Model-Based Reinforcement Learning with Automated Planning for Network Management
    Ordonez, Armando
    Mauricio Caicedo, Oscar
    Villota, William
    Rodriguez-Vivas, Angela
    da Fonseca, Nelson L. S.
    SENSORS, 2022, 22 (16)