Beyond dichotomies in reinforcement learning

Cited by: 73
Authors
Collins, Anne G. E. [1 ,2 ]
Cockburn, Jeffrey [3 ]
Affiliations
[1] Univ Calif Berkeley, Dept Psychol, 3210 Tolman Hall, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Helen Wills Neurosci Inst, Berkeley, CA 94720 USA
[3] CALTECH, Div Humanities & Social Sci, Pasadena, CA 91125 USA
Keywords
DOPAMINE NEURONS ENCODE; MODEL-BASED CONTROL; PREFRONTAL CORTEX; WORKING-MEMORY; INDIVIDUAL-DIFFERENCES; PREDICTION ERRORS; DECISION-MAKING; BASAL GANGLIA; SYSTEMS; REWARD
DOI
10.1038/s41583-020-0355-6
Chinese Library Classification (CLC)
Q189 [Neuroscience]
Subject Classification Code
071006
Abstract
Reinforcement learning (RL) is a framework of particular importance to psychology, neuroscience and machine learning. Interactions between these fields, as promoted through the common hub of RL, have facilitated paradigm shifts that relate multiple levels of analysis within a single framework (for example, relating dopamine function to a computationally defined RL signal). Recently, more sophisticated RL algorithms have been proposed to better account for human learning, and in particular its oft-documented reliance on two separable systems: a model-based (MB) system and a model-free (MF) system. However, along with many benefits, this dichotomous lens can distort questions and may contribute to an unnecessarily narrow perspective on learning and decision-making. Here, we outline some of the consequences of overconfidently mapping algorithms, such as MB versus MF RL, onto putative cognitive processes. We argue that the field is well positioned to move beyond simplistic dichotomies, and we propose a means of refocusing research questions towards the rich and complex components that comprise learning and decision-making.

Reinforcement learning has been suggested to come in two flavours: model-free and model-based. In this Perspective, Collins and Cockburn explain why viewing reinforcement learning through this dichotomous lens is not always accurate or helpful, and suggest paths forward.
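For readers unfamiliar with the distinction the abstract draws, the sketch below contrasts the two textbook algorithms usually meant by these labels: a model-free temporal-difference (Q-learning) update, whose reward prediction error is the computationally defined RL signal classically related to phasic dopamine activity, and a model-based agent that plans over learned transition and reward estimates via Bellman backups. This is a minimal illustration assuming standard textbook definitions, not the authors' own models; all names and parameters (N_STATES, ALPHA, GAMMA, etc.) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the MF/MB distinction; parameters are illustrative,
# not drawn from the paper.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA = 0.1, 0.95  # learning rate, temporal discount factor

# --- Model-free (MF): temporal-difference Q-learning -------------------
# Values are cached directly from experience; the TD error "delta" is the
# reward prediction error classically related to phasic dopamine responses.
Q_mf = np.zeros((N_STATES, N_ACTIONS))

def mf_update(s, a, r, s_next):
    delta = r + GAMMA * Q_mf[s_next].max() - Q_mf[s, a]  # prediction error
    Q_mf[s, a] += ALPHA * delta
    return delta

# --- Model-based (MB): plan over a learned model of the task -----------
# The agent estimates transition probabilities T and expected rewards R,
# then derives values by iterating the Bellman equation (value iteration).
T = np.ones((N_STATES, N_ACTIONS, N_STATES)) / N_STATES  # P(s' | s, a)
R = np.zeros((N_STATES, N_ACTIONS))                       # E[r | s, a]

def mb_plan(n_iters=100):
    Q_mb = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_iters):
        V = Q_mb.max(axis=1)        # state values under a greedy policy
        Q_mb = R + GAMMA * (T @ V)  # Bellman backup through the model
    return Q_mb
```

Hybrid accounts of the kind discussed in this Perspective typically weight these two value estimates against each other; the paper's argument is that this binary framing, however convenient, can obscure the richer set of processes that actually support learning and decision-making.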
Pages: 576-586
Page count: 11