Prediction and Control in Continual Reinforcement Learning

Cited by: 0

Authors:

Anand, Nishanth [1,2]

Precup, Doina [1,3]

Affiliations:

[1] McGill Univ, Sch Comp Sci, Montreal, PQ, Canada

[2] Mila, Montreal, PQ, Canada

[3] DeepMind, London, England

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

GAME; GO;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Temporal difference (TD) learning is often used to update the estimate of the value function, which reinforcement learning (RL) agents use to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We propose to decompose the value function into two components that update at different timescales: a permanent value function, which holds general knowledge that persists over time, and a transient value function, which allows quick adaptation to new situations. We establish theoretical results showing that our approach is well suited for continual learning and draw connections to the complementary learning systems (CLS) theory from neuroscience. Empirically, this approach improves performance significantly on both prediction and control problems.
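
The two-timescale decomposition described in the abstract can be illustrated with a minimal tabular sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: the class name `PermanentTransientValue`, the additive combination of the two components, the step sizes, and the `consolidate` step that folds the transient estimate into the permanent one before resetting it. It should not be read as the authors' algorithm.

```python
import numpy as np


class PermanentTransientValue:
    """Hypothetical tabular value estimate split into slow and fast parts."""

    def __init__(self, n_states, alpha_transient=0.5, alpha_permanent=0.1, gamma=0.9):
        self.v_permanent = np.zeros(n_states)  # slow-changing general knowledge
        self.v_transient = np.zeros(n_states)  # fast, task-specific correction
        self.alpha_transient = alpha_transient
        self.alpha_permanent = alpha_permanent
        self.gamma = gamma

    def value(self, state):
        # Assumption: the overall estimate is the sum of the two components.
        return self.v_permanent[state] + self.v_transient[state]

    def fast_update(self, state, reward, next_state, done):
        # TD(0) error computed on the combined estimate; only the transient
        # component is changed at this fast timescale.
        bootstrap = 0.0 if done else self.gamma * self.value(next_state)
        td_error = reward + bootstrap - self.value(state)
        self.v_transient[state] += self.alpha_transient * td_error
        return td_error

    def consolidate(self):
        # Slow timescale (assumed to run rarely, e.g. when the situation
        # changes): move the permanent estimate a small step toward the
        # combined estimate, then reset the transient component.
        self.v_permanent += self.alpha_permanent * self.v_transient
        self.v_transient[:] = 0.0


agent = PermanentTransientValue(n_states=2)
agent.fast_update(state=0, reward=1.0, next_state=1, done=True)
value_before = agent.value(0)  # driven entirely by the transient part
agent.consolidate()
value_after = agent.value(0)   # a small share retained in the permanent part
```

In this sketch the transient component reacts to each transition with a large step size, while the permanent component changes only during consolidation and with a small step size, which is one simple way to realize "different timescales".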

Pages: 39
