LOSS OF PLASTICITY IN CONTINUAL DEEP REINFORCEMENT LEARNING

Cited by: 0
Authors
Abbas, Zaheer [1 ]
Zhao, Rosie [2 ]
Modayil, Joseph [1 ]
White, Adam [3 ,4 ,5 ]
Machado, Marlos C. [3 ,4 ,5 ]
Affiliations
[1] DeepMind, London, England
[2] Harvard Univ, Cambridge, MA 02138 USA
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[4] Alberta Machine Intelligence Inst, Edmonton, AB, Canada
[5] CIFAR AI Chair, Toronto, ON, Canada
Source
CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper, we characterize the behavior of canonical value-based deep reinforcement learning (RL) approaches under varying degrees of non-stationarity. In particular, we demonstrate that deep RL agents lose their ability to learn good policies when they cycle through a sequence of Atari 2600 games. This phenomenon is alluded to in prior work under various guises, e.g., loss of plasticity, implicit under-parameterization, primacy bias, and capacity loss. We investigate this phenomenon closely at scale and analyze how the weights, gradients, and activations change over time in several experiments with varying experimental conditions (e.g., similarity between games, number of games, number of frames per game). Our analysis shows that the activation footprint of the network becomes sparser over time, contributing to the diminishing gradients. We investigate a remarkably simple mitigation strategy, the Concatenated ReLU (CReLU) activation function, and demonstrate its effectiveness in facilitating continual learning in a changing environment.
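The CReLU activation mentioned in the abstract concatenates the rectified positive and negative parts of its input, so every pre-activation influences the output through one of the two halves. A minimal NumPy sketch, assuming the standard CReLU definition CReLU(x) = [ReLU(x), ReLU(-x)] (the function name here is illustrative, not from the paper's code):

```python
import numpy as np

def crelu(x, axis=-1):
    """Concatenated ReLU: stacks ReLU(x) and ReLU(-x) along the
    feature axis, doubling the feature dimension. Unlike a plain
    ReLU unit, a CReLU pair cannot go fully dormant: whenever the
    pre-activation is nonzero, one of the two halves fires."""
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)], axis=axis)

# A layer with 3 pre-activations yields 6 output features.
x = np.array([[1.5, -2.0, 0.0]])
print(crelu(x))  # [[1.5 0.  0.  0.  2.  0. ]]
```

Note that because the output width doubles, the following layer's weight matrix must be sized accordingly when swapping CReLU in for ReLU.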
Pages: 620-636
Page count: 17