Temporal sequence learning, prediction, and control: A review of different models and their relation to biological mechanisms

Cited by: 129
Authors
Wörgötter, F [1]
Porr, B [1]
Affiliations
[1] Univ Stirling, Dept Psychol, Stirling FK9 4LA, Scotland
DOI
10.1162/0899766053011555
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this review, we compare methods for temporal sequence learning (TSL) across the disciplines machine-control, classical conditioning, neuronal models for TSL as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? and How do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment will be discussed, showing that both learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal-ganglia, thalamus, and cortex) and the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.
Pages: 245-319
Page count: 75
Related papers
276 in total
[81]  
GERFEN CR, 1992, ANNU REV NEUROSCI, V15, P285, DOI 10.1146/annurev.ne.15.030192.001441
[82]   Mathematical formulations of Hebbian learning [J].
Gerstner, W ;
Kistler, WM .
BIOLOGICAL CYBERNETICS, 2002, 87 (5-6) :404-415
[83]   A neuronal learning rule for sub-millisecond temporal coding [J].
Gerstner, W ;
Kempter, R ;
van Hemmen, JL ;
Wagner, H .
NATURE, 1996, 383 (6595) :76-78
[84]   A problem with Hebb and local spikes [J].
Goldberg, J ;
Holthoff, K ;
Yuste, R .
TRENDS IN NEUROSCIENCES, 2002, 25 (09) :433-435
[85]   Dendritic spikes as a mechanism for cooperative long-term potentiation [J].
Golding, NL ;
Staff, NP ;
Spruston, N .
NATURE, 2002, 418 (6895) :326-331
[86]   Dichotomy of action-potential backpropagation in CA1 pyramidal neuron dendrites [J].
Golding, NL ;
Kath, WL ;
Spruston, N .
JOURNAL OF NEUROPHYSIOLOGY, 2001, 86 (06) :2998-3010
[87]  
GORMEZANO I, 1983, PROGR PSYCHOBIOLOGY, P198
[88]   L-type calcium channels and GSK-3 regulate the activity of NF-ATc4 in hippocampal neurons [J].
Graef, IA ;
Mermelstein, PG ;
Stankunas, K ;
Neilson, JR ;
Deisseroth, K ;
Tsien, RW ;
Crabtree, GR .
NATURE, 1999, 401 (6754) :703-708
[89]   Beyond the dopamine receptor: the DARPP-32/Protein phosphatase-1 cascade [J].
Greengard, P ;
Allen, PB ;
Nairn, AC .
NEURON, 1999, 23 (03) :435-447
[90]  
GROENEWEGEN HJ, 1990, PROG BRAIN RES, V85, P95