Prediction and Control in Continual Reinforcement Learning

Cited by: 0
Authors
Anand, Nishanth [1 ,2 ]
Precup, Doina [1 ,3 ]
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ, Canada
[2] Mila, Montreal, PQ, Canada
[3] DeepMind, London, England
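Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
GAME; GO;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract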
Temporal difference (TD) learning is often used to update the estimate of the value function which is used by RL agents to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We propose to decompose the value function into two components which update at different timescales: a permanent value function, which holds general knowledge that persists over time, and a transient value function, which allows quick adaptation to new situations. We establish theoretical results showing that our approach is well suited for continual learning and draw connections to the complementary learning systems (CLS) theory from neuroscience. Empirically, this approach improves performance significantly on both prediction and control problems.
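The decomposition described in the abstract can be illustrated with a small tabular sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration rather than the authors' algorithm: the class name `PermanentTransientValue`, the step sizes `alpha_fast` and `alpha_slow`, the choice to sum the two components, and the choice to fold the transient estimate into the permanent one and then reset it at a task boundary are all hypothetical.

```python
class PermanentTransientValue:
    """Tabular value estimate split into a slow and a fast component.

    Assumption: the overall estimate is permanent[s] + transient[s].
    This is an illustrative sketch, not the paper's exact method.
    """

    def __init__(self, n_states, alpha_fast=0.5, alpha_slow=0.05, gamma=0.9):
        self.permanent = [0.0] * n_states  # slow-changing, general knowledge
        self.transient = [0.0] * n_states  # fast-changing, task-specific correction
        self.alpha_fast = alpha_fast
        self.alpha_slow = alpha_slow
        self.gamma = gamma

    def value(self, s):
        # Combined estimate used for bootstrapping and for acting.
        return self.permanent[s] + self.transient[s]

    def td_update(self, s, r, s_next, done):
        # Standard TD(0) error on the combined estimate.
        bootstrap = 0.0 if done else self.gamma * self.value(s_next)
        delta = r + bootstrap - self.value(s)
        # Only the transient component absorbs the error (fast timescale).
        self.transient[s] += self.alpha_fast * delta
        return delta

    def consolidate(self):
        # Assumed slow-timescale step, e.g. at a task boundary: move the
        # permanent component a small step toward the combined estimate,
        # then reset the transient component.
        for s in range(len(self.permanent)):
            self.permanent[s] += self.alpha_slow * self.transient[s]
            self.transient[s] = 0.0


# Usage: one terminal transition with reward 1, then a consolidation step.
agent = PermanentTransientValue(n_states=2, alpha_fast=0.5, alpha_slow=0.1)
agent.td_update(s=0, r=1.0, s_next=1, done=True)
agent.consolidate()
```

In this sketch the transient table tracks the current task quickly, while the permanent table changes only at consolidation and by a small step, which is one simple way to realize two update timescales.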
Pages: 39
Related papers
50 in total
  • [1] SLER: Self-generated long-term experience replay for continual reinforcement learning
    Li, Chunmao
    Li, Yang
    Zhao, Yinliang
    Peng, Peng
    Geng, Xupeng
    APPLIED INTELLIGENCE, 2021, 51 (01) : 185 - 201
  • [2] Reinforcement Learning for Control with Multiple Frequencies
    Lee, Jongmin
    Lee, Byung-Jun
    Kim, Kee-Eung
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [3] Deep reinforcement learning control of hydraulic fracturing
    Bangi, Mohammed Saad Faizan
    Kwon, Joseph Sang-Il
    COMPUTERS & CHEMICAL ENGINEERING, 2021, 154
  • [4] Neural Malware Control with Deep Reinforcement Learning
    Wang, Yu
    Stokes, Jack W.
    Marinescu, Mady
    MILCOM 2019 - 2019 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM), 2019
  • [5] Experience Selection in Deep Reinforcement Learning for Control
    de Bruin, Tim
    Kober, Jens
    Tuyls, Karl
    Babuska, Robert
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 19
  • [6] Satellite Attitude Control with Deep Reinforcement Learning
    Gao, Duozhi
    Zhang, Haibo
    Li, Chuanjiang
    Gao, Xinzhou
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 4095 - 4101
  • [7] Deep reinforcement learning for quantum gate control
    An, Zheng
    Zhou, D. L.
    EPL, 2019, 126 (06)
  • [8] Integrating Classical Control into Reinforcement Learning Policy
    Huang, Ye
    Gu, Chaochen
    Guan, Xinping
    NEURAL PROCESSING LETTERS, 2021, 53 (03) : 1709 - 1722
  • [9] Autonomous Vehicles Roundup Strategy by Reinforcement Learning with Prediction Trajectory
    Ni, Jiayang
    Ma, Rubing
    Zhong, Hua
    Wang, Bo
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 3370 - 3375
  • [10] A Deep Reinforcement Learning Perspective on Internet Congestion Control
    Jay, Nathan
    Rotman, Noga H.
    Godfrey, P. Brighten
    Schapira, Michael
    Tamar, Aviv
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97