Working Memory Load Strengthens Reward Prediction Errors

Cited by: 75
Authors
Collins, Anne G. E. [1 ,2 ,3 ]
Ciullo, Brittany [3 ]
Frank, Michael J. [3 ,4 ]
Badre, David [3 ,4 ]
Affiliations
[1] Univ Calif Berkeley, Dept Psychol, 3210 Tolman Hall, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Helen Wills Neurosci Inst, Berkeley, CA 94720 USA
[3] Brown Univ, Dept Cognit Linguist & Psychol Sci, Providence, RI 02912 USA
[4] Brown Univ, Brown Inst Brain Sci, Providence, RI 02912 USA
Funding
US National Institutes of Health; US National Science Foundation;
Keywords
fMRI; reinforcement learning; reward prediction error; working memory; REINFORCEMENT LEARNING SIGNALS; PREFRONTAL CORTEX; DOPAMINE; SYSTEMS; CHOICES; ORGANIZATION; NEUROSCIENCE; MECHANISMS; STRIATUM; BEHAVIOR;
DOI
10.1523/JNEUROSCI.2700-16.2017
Chinese Library Classification (CLC)
Q189 [Neuroscience];
Discipline Code
071006;
Abstract
Reinforcement learning (RL) in simple instrumental tasks is usually modeled as a monolithic process in which reward prediction errors (RPEs) are used to update expected values of choice options. This modeling ignores the distinct contributions of the multiple memory and decision-making systems thought to support even simple learning. In an fMRI experiment, we investigated how working memory (WM) and incremental RL processes interact to guide human learning. WM load was manipulated by varying the number of stimuli to be learned across blocks. Behavioral results and computational modeling confirmed that learning was best explained as a mixture of two mechanisms: a fast, capacity-limited, and delay-sensitive WM process together with slower RL. Model-based analysis of fMRI data showed that striatum and lateral prefrontal cortex were sensitive to RPE, as shown previously, but, critically, these signals were reduced when the learning problem was within the capacity of WM. The degree of this neural interaction related to individual differences in the use of WM to guide behavioral learning. These results indicate that the two systems do not process information independently, but rather interact during learning.
引用
收藏
页码:4332 / 4342
页数:11
相关论文
共 36 条
[1]  
[Anonymous], 2002, Model selection and multimodel inference: a practical informationtheoretic approach
[2]  
[Anonymous], 2016, ARXIV160608199
[3]   Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex [J].
Badre, David ;
D'Esposito, Mark .
JOURNAL OF COGNITIVE NEUROSCIENCE, 2007, 19 (12) :2082-2099
[4]   Mechanisms of Hierarchical Reinforcement Learning in Cortico-Striatal Circuits 2: Evidence from fMRI [J].
Badre, David ;
Frank, Michael J. .
CEREBRAL CORTEX, 2012, 22 (03) :527-536
[5]   Dopamine neuron systems in the brain:: an update [J].
Bjorklund, Anders ;
Dunnett, Stephen B. .
TRENDS IN NEUROSCIENCES, 2007, 30 (05) :194-202
[6]   Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective [J].
Botvinick, Matthew M. ;
Niv, Yael ;
Barto, Andrew C. .
COGNITION, 2009, 113 (03) :262-280
[7]   Reasoning, Learning, and Creativity: Frontal Lobe Function and Human Decision-Making [J].
Collins, Anne ;
Koechlin, Etienne .
PLOS BIOLOGY, 2012, 10 (03)
[8]   Working Memory Contributions to Reinforcement Learning Impairments in Schizophrenia [J].
Collins, Anne G. E. ;
Brown, Jaime K. ;
Gold, James M. ;
Waltz, James A. ;
Frank, Michael J. .
JOURNAL OF NEUROSCIENCE, 2014, 34 (41) :13747-13756
[9]   Opponent Actor Learning (OpAL): Modeling Interactive Effects of Striatal Dopamine on Reinforcement Learning and Choice Incentive [J].
Collins, Anne G. E. ;
Frank, Michael J. .
PSYCHOLOGICAL REVIEW, 2014, 121 (03) :337-366
[10]   Cognitive Control Over Learning: Creating, Clustering, and Generalizing Task-Set Structure [J].
Collins, Anne G. E. ;
Frank, Michael J. .
PSYCHOLOGICAL REVIEW, 2013, 120 (01) :190-229