Strategically managing learning during perceptual decision making

Cited by: 8

Authors:

Masis, Javier [1,2,4]

Chapman, Travis [2]

Rhee, Juliana Y. [1,2,5]

Cox, David D. [1,2,6]

Saxe, Andrew M. [3,7,8]

Affiliations:

[1] Harvard Univ, Dept Mol & Cellular Biol, Cambridge, MA 02138 USA

[2] Harvard Univ, Ctr Brain Sci, Cambridge, MA 02138 USA

[3] Univ Oxford, Dept Expt Psychol, Oxford, England

[4] Princeton Univ, Princeton Neurosci Inst, INSERM, Princeton, NJ USA

[5] Rockefeller Univ, New York, NY USA

[6] MIT IBM Watson AI Lab, Cambridge, MA USA

[7] UCL, Gatsby Unit, London, England

[8] UCL, Sainsbury Wellcome Ctr, London, England

Source:

ELIFE | 2023 / Vol. 12

Funding:

Wellcome Trust (UK);

Keywords:

learning; decision making; neural networks; behavior; cognitive control; inter-temporal choice; Rat; SPEED-ACCURACY TRADEOFF; REACTION-TIME DISTRIBUTIONS; COMPUTATIONAL RATIONALITY; REWARD RATE; MODEL; CHOICE; DISCRIMINATION; ACCOUNT; CORTEX; OPTIMALITY;

DOI:

10.7554/eLife.64978

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Making optimal decisions in the face of noise requires balancing short-term speed and accuracy. But a theory of optimality should account for the fact that short-term speed can influence long-term accuracy through learning. Here, we demonstrate that long-term learning is an important dynamical dimension of the speed-accuracy trade-off. We study learning trajectories in rats and formally characterize these dynamics in a theory expressed as both a recurrent neural network and an analytical extension of the drift-diffusion model that learns over time. The model reveals that choosing suboptimal response times to learn faster sacrifices immediate reward, but can lead to greater total reward. We empirically verify predictions of the theory, including a relationship between stimulus exposure and learning speed, and a modulation of reaction time by future learning prospects. We find that rats' strategies approximately maximize total reward over the full learning epoch, suggesting cognitive control over the learning process.
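As a rough illustration of the short-term speed-accuracy trade-off that the abstract starts from, the sketch below uses the standard closed-form expressions for an unbiased two-boundary drift-diffusion model. It is not the learning model described in the abstract: the function names, the parameter values, and the simplified reward-rate definition (accuracy divided by mean trial time) are assumptions chosen for illustration only.

```python
import math


def ddm_error_rate(drift, threshold, noise=1.0):
    """Error rate of an unbiased DDM with boundaries at +/- threshold."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * threshold / noise ** 2))


def ddm_mean_decision_time(drift, threshold, noise=1.0):
    """Mean decision time of the same unbiased DDM."""
    return (threshold / drift) * math.tanh(drift * threshold / noise ** 2)


def reward_rate(drift, threshold, noise=1.0, non_decision_time=0.5):
    """Correct responses per unit time (simplified, assumed definition)."""
    accuracy = 1.0 - ddm_error_rate(drift, threshold, noise)
    trial_time = ddm_mean_decision_time(drift, threshold, noise) + non_decision_time
    return accuracy / trial_time


# Scan a grid of thresholds (hypothetical values) for a fixed drift of 1.0.
thresholds = [i * 0.01 for i in range(1, 201)]
best_threshold = max(thresholds, key=lambda z: reward_rate(1.0, z))

print("threshold maximizing instantaneous reward rate:", best_threshold)
print("reward rate at that threshold:", reward_rate(1.0, best_threshold))
print("reward rate at a more cautious threshold of 1.0:", reward_rate(1.0, 1.0))
```

A higher threshold means slower, more accurate responses. In this sketch a threshold above the reward-rate-maximizing one lowers immediate reward, which is the short-term cost the abstract refers to; the long-term benefit through faster learning is not modeled here.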


Pages: 43
