Temporal sequence learning, prediction, and control: A review of different models and their relation to biological mechanisms

Cited by: 129

Authors:

Wörgötter, F [1]

Porr, B [1]

Affiliation:

[1] Univ Stirling, Dept Psychol, Stirling FK9 4LA, Scotland

Source:

NEURAL COMPUTATION | 2005 / Vol. 17 / No. 2

Keywords:

DOI:

10.1162/0899766053011555

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this review, we compare methods for temporal sequence learning (TSL) across the disciplines machine-control, classical conditioning, neuronal models for TSL as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? and How do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment will be discussed, showing that both learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal-ganglia, thalamus, and cortex) and the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.


Pages: 245-319

Page count: 75

References: 276 in total

