Combining backpropagation with Equilibrium Propagation to improve an Actor-Critic reinforcement learning framework

Cited by: 5
Authors
Kubo, Yoshimasa [1]
Chalmers, Eric [2]
Luczak, Artur [1]
Affiliations
[1] Univ Lethbridge, Canadian Ctr Behav Neurosci, Lethbridge, AB, Canada
[2] Mt Royal Univ, Dept Math & Comp, Calgary, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Equilibrium Propagation; Actor-Critic (AC); biologically plausible; reinforcement learning; backpropagation;
DOI
10.3389/fncom.2022.980613
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Backpropagation (BP) has been used to train neural networks for many years, allowing them to solve a wide variety of tasks such as image classification, speech recognition, and reinforcement learning. However, the biological plausibility of BP as a mechanism of neural learning has been questioned. Equilibrium Propagation (EP) has been proposed as a more biologically plausible alternative and achieves accuracy comparable to BP on the CIFAR-10 image classification task. This study proposes the first EP-based reinforcement learning architecture: an Actor-Critic architecture with the actor network trained by EP. We show that this model can solve the basic control tasks often used as benchmarks for BP-based models. Interestingly, our trained model demonstrates more consistent high-reward behavior than a comparable model trained exclusively by BP.
Pages: 8