LOSS OF PLASTICITY IN CONTINUAL DEEP REINFORCEMENT LEARNING

Cited by: 0
Authors
Abbas, Zaheer [1 ]
Zhao, Rosie [2 ]
Modayil, Joseph [1 ]
White, Adam [3 ,4 ,5 ]
Machado, Marlos C. [3 ,4 ,5 ]
Affiliations
[1] DeepMind, London, England
[2] Harvard Univ, Cambridge, MA 02138 USA
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[4] Alberta Machine Intelligence Inst, Edmonton, AB, Canada
[5] CIFAR AI Chair, Toronto, ON, Canada
Source
CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper, we characterize the behavior of canonical value-based deep reinforcement learning (RL) approaches under varying degrees of non-stationarity. In particular, we demonstrate that deep RL agents lose their ability to learn good policies when they cycle through a sequence of Atari 2600 games. This phenomenon is alluded to in prior work under various guises, e.g., loss of plasticity, implicit under-parameterization, primacy bias, and capacity loss. We investigate this phenomenon closely at scale and analyze how the weights, gradients, and activations change over time in several experiments with varying experimental conditions (e.g., similarity between games, number of games, number of frames per game). Our analysis shows that the activation footprint of the network becomes sparser over time, contributing to the diminishing gradients. We investigate a remarkably simple mitigation strategy, the Concatenated ReLU (CReLU) activation function, and demonstrate its effectiveness in facilitating continual learning in a changing environment.
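The CReLU activation mentioned in the abstract concatenates the rectified positive and negative parts of its input, so every pre-activation influences the output through one of the two halves. A minimal NumPy sketch, assuming the standard CReLU definition CReLU(x) = [ReLU(x), ReLU(-x)] (the function name here is illustrative, not from the paper's code):

```python
import numpy as np

def crelu(x, axis=-1):
    """Concatenated ReLU: stacks ReLU(x) and ReLU(-x) along the
    feature axis, doubling the feature dimension. Unlike a plain
    ReLU unit, a CReLU pair cannot go fully dormant: whenever the
    pre-activation is nonzero, one of the two halves fires."""
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)], axis=axis)

# A layer with 3 pre-activations yields 6 output features.
x = np.array([[1.5, -2.0, 0.0]])
print(crelu(x))  # [[1.5 0.  0.  0.  2.  0. ]]
```

Note that because the output width doubles, the following layer's weight matrix must be sized accordingly when swapping CReLU in for ReLU.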
Pages: 620-636
Page count: 17