Dopamine transients do not act as model-free prediction errors during associative learning

Cited by: 39

Authors:

Sharpe, Melissa J. [1,2,3,4]

Batchelor, Hannah M. [1]

Mueller, Lauren E. [1]

Chang, Chun Yun [1]

Maes, Etienne J. P. [1]

Niv, Yael [2,5]

Schoenbaum, Geoffrey [1,6,7,8]

Affiliations:

[1] NIDA, Intramural Res Program, Baltimore, MD 21224 USA

[2] Princeton Univ, Princeton Neurosci Inst, Princeton, NJ 08544 USA

[3] UNSW, Sch Psychol, Sydney, NSW, Australia

[4] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA 90095 USA

[5] Princeton Univ, Psychol Dept, Princeton, NJ 08544 USA

[6] Univ Maryland, Sch Med, Dept Anat & Neurobiol, Baltimore, MD 21201 USA

[7] Univ Maryland, Sch Med, Dept Psychiat, Baltimore, MD 21201 USA

[8] Johns Hopkins Univ, Solomon H Snyder Dept Neurosci, Baltimore, MD 21287 USA

Source:

NATURE COMMUNICATIONS | 2020, Vol. 11, Issue 01

Keywords:

ORBITOFRONTAL CORTEX; REINFORCEMENT; ACQUISITION; BEHAVIOR; RELEASE; SUFFICIENT; PSYCHOSIS; NEURONS;

DOI:

10.1038/s41467-019-13953-1

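Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];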

Subject classification codes:

07 ; 0710 ; 09 ;

Abstract:

Dopamine neurons are proposed to signal the reward prediction error in model-free reinforcement learning algorithms. This term represents the unpredicted or 'excess' value of the rewarding event, value that is then added to the intrinsic value of any antecedent cues, contexts or events. To support this proposal, proponents cite evidence that artificially-induced dopamine transients cause lasting changes in behavior. Yet these studies do not generally assess learning under conditions where an endogenous prediction error would occur. Here, to address this, we conducted three experiments where we optogenetically activated dopamine neurons while rats were learning associative relationships, both with and without reward. In each experiment, the antecedent cues failed to acquire value and instead entered into associations with the later events, whether valueless cues or valued rewards. These results show that in learning situations appropriate for the appearance of a prediction error, dopamine transients support associative, rather than model-free, learning.
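The model-free reward prediction error mentioned in the abstract can be illustrated with a minimal temporal-difference (TD) sketch. This is not the authors' model or analysis: the function names `td_update` and `run_trials`, the learning rate `alpha`, the discount `gamma`, and the two-step cue-then-reward trial are all assumptions chosen for illustration.

```python
def td_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """Apply one TD(0) update to `values` and return the prediction error.

    All names and parameter values here are illustrative assumptions.
    """
    # Value of the successor state; a terminal transition (None) is worth 0.
    next_value = 0.0 if next_state is None else values.get(next_state, 0.0)
    current = values.get(state, 0.0)
    # Prediction error: the "excess" value relative to what was expected.
    delta = reward + gamma * next_value - current
    # The error is added to the cached value of the antecedent state.
    values[state] = current + alpha * delta
    return delta


def run_trials(n_trials=200, alpha=0.1):
    """Simulate repeated cue -> reward trials (hypothetical task)."""
    values = {"cue": 0.0, "reward_state": 0.0}
    errors_at_reward = []
    for _ in range(n_trials):
        # The cue is followed by the reward state; no reward is delivered yet.
        td_update(values, "cue", "reward_state", 0.0, alpha)
        # A reward of 1.0 is delivered, then the trial ends.
        delta = td_update(values, "reward_state", None, 1.0, alpha)
        errors_at_reward.append(delta)
    return values, errors_at_reward


if __name__ == "__main__":
    values, errors = run_trials()
    print("first error at reward:", errors[0])
    print("last error at reward:", errors[-1])
    print("cached cue value:", values["cue"])
```

In this sketch the error at reward delivery shrinks as the reward becomes predicted, while the cue accumulates cached value. The abstract reports the contrasting empirical result: with optogenetically induced dopamine transients, the antecedent cues did not acquire value in this way.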

Pages: 10

Total: 39 entries

[31] Adaptable reservoir computing: A paradigm for model-free data-driven prediction of critical transitions in nonlinear dynamical systems
Panahi, Shirin
Lai, Ying-Cheng
CHAOS, 2024, 34 (05)
[32] Overlapping Prediction Errors in Dorsal Striatum During Instrumental Learning With Juice and Money Reward in the Human Brain
Valentin, Vivian V.
O'Doherty, John P.
JOURNAL OF NEUROPHYSIOLOGY, 2009, 102 (06) : 3384 - 3391
[33] Transcranial Direct Current Stimulation of Right Dorsolateral Prefrontal Cortex Does Not Affect Model-Based or Model-Free Reinforcement Learning in Humans
Smittenaar, Peter
Prichard, George
FitzGerald, Thomas H. B.
Diedrichsen, Joern
Dolan, Raymond J.
PLOS ONE, 2014, 9 (01)
[34] MF^2: Model-free reinforcement learning for modeling-free building HVAC control with data-driven environment construction in a residential building
Wang, Man
Lin, Borong
BUILDING AND ENVIRONMENT, 2023, 244
[35] A mechanistic model of ADHD as resulting from dopamine phasic/tonic imbalance during reinforcement learning
Veronneau-Veilleux, Florence
Robaey, Philippe
Ursino, Mauro
Nekka, Fahima
FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2022, 16
[36] Local minimization of prediction errors drives learning of invariant object representations in a generative network model of visual perception
Brucklacher, Matthias
Bohte, Sander M.
Mejias, Jorge F.
Pennartz, Cyriel M. A.
FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2023, 17
[37] Two-dimensional model-free Q-learning-based output feedback fault-tolerant control for batch processes
Shi, Huiyuan
Gao, Wei
Jiang, Xueying
Su, Chengli
Li, Ping
COMPUTERS & CHEMICAL ENGINEERING, 2024, 182
[38] A novel technique for delineating the effect of variation in the learning rate on the neural correlates of reward prediction errors in model-based fMRI
Chase, Henry W.
FRONTIERS IN PSYCHOLOGY, 2023, 14
[39] Online Visual Feedback during Error-Free Channel Trials Leads to Active Unlearning of Movement Dynamics: Evidence for Adaptation to Trajectory Prediction Errors
Lago-Rodriguez, Angel
Miall, R. Chris
FRONTIERS IN HUMAN NEUROSCIENCE, 2016, 10
