When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but also to put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. In simulation, we also compared the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy. The exploration strategies inspired by the basal ganglia model can achieve overall superior performance in simulation, and fitting them to behavioural data gave results qualitatively similar to those obtained by fitting more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation which efficiently drives exploration in reinforcement learning.

Humans and other animals learn from rewards and losses resulting from their actions to maximise their chances of survival. In many cases, a trial-and-error process is necessary to determine the most rewarding action in a certain context. During this process, determining how much resource should be allocated to acquiring information ("exploration") and how much to utilising existing information to maximise reward ("exploitation") is key to overall effectiveness, i.e., the maximisation of total reward obtained with a certain amount of effort. We propose a theory whereby an area within the mammalian brain called the basal ganglia integrates current knowledge about the mean reward, reward uncertainty and novelty of an action in order to implement an algorithm which optimally allocates resources between exploration and exploitation. We verify our theory using behavioural experiments and electrophysiological recording, and show in simulations that the model also achieves good performance in comparison with established benchmark algorithms.
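
The idea summarised above, scoring each action by its estimated mean reward plus bonuses for reward uncertainty and novelty, can be illustrated with a small bandit simulation. The Python sketch below is not the model from this work. The class name `MeanSpreadLearner`, the delta-rule updates, the use of the mean absolute prediction error as the spread estimate, the geometric novelty decay and all parameter values are assumptions chosen purely for illustration. A textbook UCB1 rule is included alongside it as one example of a classic UCB variant.

```python
import math
import random


def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def ucb1_choice(means, counts, t, c=2.0):
    """Textbook UCB1: try every action once, then maximise mean + bonus."""
    for action, n in enumerate(counts):
        if n == 0:
            return action
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return argmax(scores)


class MeanSpreadLearner:
    """Hypothetical learner tracking mean, spread and novelty per action."""

    def __init__(self, n_actions, alpha_mean=0.1, alpha_spread=0.1,
                 spread0=1.0, novelty0=1.0, novelty_decay=0.5):
        self.mean = [0.0] * n_actions
        self.spread = [spread0] * n_actions    # assumed initial uncertainty
        self.novelty = [novelty0] * n_actions  # assumed initial novelty bonus
        self.alpha_mean = alpha_mean
        self.alpha_spread = alpha_spread
        self.novelty_decay = novelty_decay

    def choose(self, w_spread=1.0, w_novelty=1.0):
        """Directed exploration: mean plus weighted spread and novelty."""
        scores = [m + w_spread * s + w_novelty * v
                  for m, s, v in zip(self.mean, self.spread, self.novelty)]
        return argmax(scores)

    def update(self, action, reward):
        """Delta-rule updates driven by the reward prediction error."""
        delta = reward - self.mean[action]
        self.mean[action] += self.alpha_mean * delta
        # The spread moves towards the absolute prediction error.
        self.spread[action] += self.alpha_spread * (abs(delta) - self.spread[action])
        # Novelty is transient: it shrinks each time the action is chosen.
        self.novelty[action] *= self.novelty_decay


def run_bandit(true_means, true_sds, n_trials=200, seed=0):
    """Simulate a Gaussian bandit; return choice counts and the learner."""
    rng = random.Random(seed)
    learner = MeanSpreadLearner(len(true_means))
    counts = [0] * len(true_means)
    for _ in range(n_trials):
        action = learner.choose()
        reward = rng.gauss(true_means[action], true_sds[action])
        learner.update(action, reward)
        counts[action] += 1
    return counts, learner


if __name__ == "__main__":
    counts, learner = run_bandit([0.2, 0.5, 0.8], [0.1, 0.5, 0.2])
    print("choices per action:", counts)
    print("estimated means:", [round(m, 2) for m in learner.mean])
```

In this sketch the novelty term dominates early choices and fades with repeated sampling, while the spread term keeps directing choices towards actions whose outcomes remain variable. The weights `w_spread` and `w_novelty` are free parameters here and carry no claim about fitted values.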