Strategically managing learning during perceptual decision making

Cited by: 8
Authors
Masis, Javier [1 ,2 ,4 ]
Chapman, Travis [2 ]
Rhee, Juliana Y. [1 ,2 ,5 ]
Cox, David D. [1 ,2 ,6 ]
Saxe, Andrew M. [3 ,7 ,8 ]
Affiliations
[1] Harvard Univ, Dept Mol & Cellular Biol, Cambridge, MA 02138 USA
[2] Harvard Univ, Ctr Brain Sci, Cambridge, MA 02138 USA
[3] Univ Oxford, Dept Expt Psychol, Oxford, England
[4] Princeton Univ, Princeton Neurosci Inst, INSERM, Princeton, NJ USA
[5] Rockefeller Univ, New York, NY USA
[6] MIT-IBM Watson AI Lab, Cambridge, MA USA
[7] UCL, Gatsby Unit, London, England
[8] UCL, Sainsbury Wellcome Ctr, London, England
Funding
Wellcome Trust (UK);
Keywords
learning; decision making; neural networks; behavior; cognitive control; inter-temporal choice; Rat; SPEED-ACCURACY TRADEOFF; REACTION-TIME DISTRIBUTIONS; COMPUTATIONAL RATIONALITY; REWARD RATE; MODEL; CHOICE; DISCRIMINATION; ACCOUNT; CORTEX; OPTIMALITY;
DOI
10.7554/eLife.64978
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Making optimal decisions in the face of noise requires balancing short-term speed and accuracy. But a theory of optimality should account for the fact that short-term speed can influence long-term accuracy through learning. Here, we demonstrate that long-term learning is an important dynamical dimension of the speed-accuracy trade-off. We study learning trajectories in rats and formally characterize these dynamics in a theory expressed as both a recurrent neural network and an analytical extension of the drift-diffusion model that learns over time. The model reveals that choosing suboptimal response times to learn faster sacrifices immediate reward, but can lead to greater total reward. We empirically verify predictions of the theory, including a relationship between stimulus exposure and learning speed, and a modulation of reaction time by future learning prospects. We find that rats' strategies approximately maximize total reward over the full learning epoch, suggesting cognitive control over the learning process.
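The short-term speed-accuracy trade-off mentioned in the abstract can be illustrated with the textbook closed-form expressions for an unbiased drift-diffusion model with symmetric bounds. The sketch below is not the learning model from the paper; it only shows how instantaneous reward rate depends on the decision threshold for a fixed signal strength. All function names, parameter names, and default values (such as `non_decision_time` and `inter_trial_interval`) are assumptions chosen for illustration.

```python
import math


def ddm_error_rate(drift, threshold, noise=1.0):
    """Error rate of an unbiased DDM with bounds at +/- threshold (drift > 0)."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * threshold / noise ** 2))


def ddm_mean_decision_time(drift, threshold, noise=1.0):
    """Mean decision time of the same model (drift must be nonzero)."""
    return (threshold / drift) * math.tanh(drift * threshold / noise ** 2)


def reward_rate(drift, threshold, noise=1.0,
                non_decision_time=0.3, inter_trial_interval=1.0):
    """Expected correct responses per unit time for a fixed threshold.

    The timing defaults are illustrative assumptions, not values from the paper.
    """
    er = ddm_error_rate(drift, threshold, noise)
    dt = ddm_mean_decision_time(drift, threshold, noise)
    return (1.0 - er) / (dt + non_decision_time + inter_trial_interval)


def best_threshold(drift, noise=1.0, max_threshold=3.0, steps=300):
    """Grid search for the threshold that maximizes instantaneous reward rate."""
    grid = [max_threshold * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda z: reward_rate(drift, z, noise))


if __name__ == "__main__":
    # Compare a weak signal (early in learning) with a strong one (late in learning).
    for drift in (0.2, 1.0, 3.0):
        z = best_threshold(drift)
        print(f"drift={drift}: threshold={z:.2f}, "
              f"reward rate={reward_rate(drift, z):.3f}")
```

A threshold chosen this way maximizes only the immediate reward rate. The abstract's claim is that longer response times than this can speed up learning and so yield greater total reward over the learning epoch, which this static sketch does not capture.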
Pages: 43