Reinforcement Learning in Multidimensional Environments Relies on Attention Mechanisms

Cited: 244
Authors
Niv, Yael [1,2]
Daniel, Reka [1,2]
Geana, Andra [1,2]
Gershman, Samuel J. [3]
Leong, Yuan Chang [4]
Radulescu, Angela [1,2]
Wilson, Robert C. [5,6]
Affiliations
[1] Princeton Univ, Dept Psychol, Princeton, NJ 08540 USA
[2] Princeton Univ, Inst Neurosci, Princeton, NJ 08540 USA
[3] MIT, Dept Brain & Cognit Sci, Cambridge, MA 02139 USA
[4] Stanford Univ, Dept Psychol, Stanford, CA 94305 USA
[5] Univ Arizona, Dept Psychol, Tucson, AZ 85721 USA
[6] Univ Arizona, Cognit Sci Program, Tucson, AZ 85721 USA
Keywords
attention; fMRI; frontoparietal network; model comparison; reinforcement learning; representation learning; PREFRONTAL CORTEX; PREDICTION ERRORS; SELECTIVE ATTENTION; COGNITIVE FUNCTIONS; PARKINSONS-DISEASE; NEURAL MECHANISMS; FRONTAL-CORTEX; MODELS; TASK; CATEGORIZATION;
DOI
10.1523/JNEUROSCI.2978-14.2015
Chinese Library Classification
Q189 [Neuroscience];
Discipline Code
071006;
Abstract
In recent years, ideas from the computational field of reinforcement learning have revolutionized the study of learning in the brain, famously providing new, precise theories of how dopamine affects learning in the basal ganglia. However, reinforcement learning algorithms are notorious for not scaling well to multidimensional environments, as is required for real-world learning. We hypothesized that the brain naturally reduces the dimensionality of real-world problems to only those dimensions that are relevant to predicting reward, and conducted an experiment to assess by what algorithms and with what neural mechanisms this "representation learning" process is realized in humans. Our results suggest that a bilateral attentional control network comprising the intraparietal sulcus, precuneus, and dorsolateral prefrontal cortex is involved in selecting what dimensions are relevant to the task at hand, effectively updating the task representation through trial and error. In this way, cortical attention mechanisms interact with learning in the basal ganglia to solve the "curse of dimensionality" in reinforcement learning.
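The core idea in the abstract — learning values over stimulus features while an attention process concentrates learning on reward-relevant dimensions — can be illustrated with a toy model. The sketch below is not the paper's fitted model; the class name, the softmax-over-value-spread attention rule, and the parameter values are all assumptions chosen only to make the dimensionality-reduction idea concrete:

```python
import math
import random

def softmax(xs, beta):
    """Numerically stable softmax with inverse temperature beta."""
    m = max(xs)
    exps = [math.exp(beta * (x - m)) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

class AttentionWeightedRL:
    """Toy feature-level RL: one value per feature within each stimulus
    dimension. Attention over dimensions scales both the value estimate
    and the prediction-error update, so irrelevant dimensions are learned
    about (and weighted) less over time."""

    def __init__(self, n_dims, n_feats, eta=0.3, beta=5.0):
        self.V = [[0.0] * n_feats for _ in range(n_dims)]
        self.eta = eta    # learning rate
        self.beta = beta  # attention sharpness

    def attention(self):
        # Attend more to dimensions whose feature values are most spread
        # out, i.e., dimensions that discriminate reward best so far.
        spread = [max(vs) - min(vs) for vs in self.V]
        return softmax(spread, self.beta)

    def value(self, stimulus):
        # stimulus: one feature index per dimension
        phi = self.attention()
        return sum(phi[d] * self.V[d][f] for d, f in enumerate(stimulus))

    def update(self, stimulus, reward):
        phi = self.attention()
        delta = reward - self.value(stimulus)  # reward prediction error
        for d, f in enumerate(stimulus):
            self.V[d][f] += self.eta * phi[d] * delta
        return delta
```

After training on a task where only one dimension predicts reward, the attention weights should concentrate on that dimension, mimicking the trial-and-error updating of the task representation described in the abstract.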
Pages: 8145-8157
Page count: 13
Related Papers
50 records total
[31]   Enhancing reinforcement learning for de novo molecular design applying self-attention mechanisms [J].
Pereira, Tiago O. ;
Abbasi, Maryam ;
Arrais, Joel P. .
BRIEFINGS IN BIOINFORMATICS, 2023, 24 (06)
[32]   Preparatory Attention Relies on Dynamic Interactions between Prelimbic Cortex and Anterior Cingulate Cortex [J].
Totah, Nelson K. B. ;
Jackson, Mark E. ;
Moghaddam, Bita .
CEREBRAL CORTEX, 2013, 23 (03) :729-738
[33]   Mechanisms of value-learning in the guidance of spatial attention [J].
Anderson, Brian A. ;
Kim, Haena .
COGNITION, 2018, 178 :26-36
[34]   Reinforcement Learning Applied to PSO for Multidimensional Knapsack Problem [J].
Olivares, Rodrigo ;
Rios, Victor ;
Olivares, Pablo ;
Serrano, Benjamin .
MACHINE LEARNING METHODS IN SYSTEMS, VOL 4, CSOC 2024, 2024, 1126 :375-382
[35]   Testing the reinforcement learning hypothesis of social conformity [J].
Levorsen, Marie ;
Ito, Ayahito ;
Suzuki, Shinsuke ;
Izuma, Keise .
HUMAN BRAIN MAPPING, 2021, 42 (05) :1328-1342
[36]   Using Reinforcement Learning to Examine Dynamic Attention Allocation During Reading [J].
Liu, Yanping ;
Reichle, Erik D. ;
Gao, Ding-Guo .
COGNITIVE SCIENCE, 2013, 37 (08) :1507-1540
[37]   Generalized attention-weighted reinforcement learning [J].
Bramlage, Lennart ;
Cortese, Aurelio .
NEURAL NETWORKS, 2022, 145 :10-21
[38]   Inverse Reinforcement Learning in Partially Observable Environments [J].
Choi, Jaedeug ;
Kim, Kee-Eung .
JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 :691-730
[39]   Reinforcement Learning with Symbiotic Relationships for Multiagent Environments [J].
Mabu, Shingo ;
Obayashi, Masanao ;
Kuremoto, Takashi .
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2015), 2015, :102-106
[40]   Reinforcement Learning for Robot Navigation in Nondeterministic Environments [J].
Liu, Xiaoyun ;
Zhou, Qingrui ;
Ren, Hailin ;
Sun, Changhao .
PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018, :615-619