The "Proactive" Model of Learning: Integrative Framework for Model-Free and Model-Based Reinforcement Learning Utilizing the Associative Learning-Based Proactive Brain Concept

Cited: 18
Authors
Zsuga, Judit [1 ]
Biro, Klara [1 ]
Papp, Csaba [1 ]
Tajti, Gabor [1 ]
Gesztelyi, Rudolf [2 ]
Affiliations
[1] Univ Debrecen, Fac Publ Hlth, Dept Hlth Syst Management & Qual Management Hlth, Nagyerdei Krt 98, H-4032 Debrecen, Hungary
[2] Univ Debrecen, Fac Pharm, Dept Pharmacol, H-4032 Debrecen, Hungary
Keywords
model-free reinforcement learning; model-based reinforcement learning; reinforcement learning agent; proactive brain; default network; GOAL-DIRECTED BEHAVIORS; ORBITOFRONTAL CORTEX; DOPAMINE NEURONS; PREDICTION ERROR; PREFRONTAL CORTEX; VENTRAL STRIATUM; BASOLATERAL AMYGDALA; INCENTIVE SALIENCE; NUCLEUS-ACCUMBENS; REPRESENT REWARD;
DOI
10.1037/bne0000116
Chinese Library Classification (CLC)
B84 [Psychology]; C [Social Sciences, General]; Q98 [Anthropology];
Discipline Classification Codes
03 ; 0303 ; 030303 ; 04 ; 0402 ;
Abstract
Reinforcement learning (RL) is a powerful concept underlying forms of associative learning governed by a scalar reward signal, with learning taking place when expectations are violated. RL may be assessed using model-based and model-free approaches. Model-based reinforcement learning involves the amygdala, the hippocampus, and the orbitofrontal cortex (OFC). The model-free system involves the pedunculopontine tegmental nucleus (PPTgN), the ventral tegmental area (VTA), and the ventral striatum (VS). Based on the functional connectivity of the VS, both the model-free and model-based RL systems converge on the VS, which computes value by integrating model-free signals (received as reward prediction error) with model-based, reward-related input. Using the concept of the reinforcement learning agent, we propose that the VS serves as the value function component of the RL agent. Regarding the model used for model-based computations, we turn to the proactive brain concept, which assigns a ubiquitous function to the default network based on its extensive functional overlap with contextual associative areas. By means of the default network, the brain continuously organizes its environment into context frames, enabling the formulation of analogy-based associations that are turned into predictions of what to expect. The OFC integrates reward-related information into context frames by computing reward expectation from the stimulus-reward and context-reward information supplied by the amygdala and the hippocampus, respectively. Furthermore, we suggest that the integration of model-based reward expectations into the value signal is further supported by efferents of the OFC that reach structures canonical for model-free learning (e.g., the PPTgN, VTA, and VS).
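The model-free mechanism the abstract describes — a value estimate updated by a scalar reward prediction error, with learning occurring only when expectations are violated — corresponds to temporal-difference (TD) learning. The following is a minimal illustrative sketch of a TD(0) update; the state names and parameter values are assumptions for the example, not taken from the paper:

```python
# Minimal TD(0) value update illustrating a scalar reward-prediction-error
# (RPE) signal. State names and parameters are illustrative only.

def td_update(value, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Update value[state] from the reward prediction error and return the RPE."""
    # Value of the successor state; a terminal transition contributes 0.
    v_next = value[next_state] if next_state is not None else 0.0
    # RPE: deviation of received reward (plus discounted future value)
    # from the current expectation. Learning stops as the RPE approaches 0.
    rpe = reward + gamma * v_next - value[state]
    value[state] += alpha * rpe
    return rpe

# Pavlovian-style example: a cue reliably precedes a rewarded outcome.
value = {"cue": 0.0, "outcome": 0.0}
for _ in range(500):
    td_update(value, "cue", "outcome", reward=0.0)  # cue -> outcome, no reward yet
    td_update(value, "outcome", None, reward=1.0)   # outcome delivers reward 1

print(value)
```

After training, the value of "outcome" approaches the reward magnitude (1.0) and the value of "cue" approaches its discounted prediction (0.9), showing how the prediction-error signal transfers reward expectation back to the predictive cue.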
Pages: 6 - 18
Number of pages: 13
Related Papers
50 records
  • [1] Predictive representations can link model-based reinforcement learning to model-free mechanisms
    Russek, Evan M.
    Momennejad, Ida
    Botvinick, Matthew M.
    Gershman, Samuel J.
    Daw, Nathaniel D.
    PLOS COMPUTATIONAL BIOLOGY, 2017, 13 (09)
  • [2] Model-based learning and the contribution of the orbitofrontal cortex to the model-free world
    McDannald, Michael A.
    Takahashi, Yuji K.
    Lopatina, Nina
    Pietras, Brad W.
    Jones, Josh L.
    Schoenbaum, Geoffrey
    EUROPEAN JOURNAL OF NEUROSCIENCE, 2012, 35 (07) : 991 - 996
  • [3] Variability in Dopamine Genes Dissociates Model-Based and Model-Free Reinforcement Learning
    Doll, Bradley B.
    Bath, Kevin G.
    Daw, Nathaniel D.
    Frank, Michael J.
    JOURNAL OF NEUROSCIENCE, 2016, 36 (04) : 1211 - 1222
  • [4] Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning
    Lehnert, Lucas
    Littman, Michael L.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [5] Sliding mode heading control for AUV based on continuous hybrid model-free and model-based reinforcement learning
    Wang, Dianrui
    Shen, Yue
    Wan, Junhe
    Sha, Qixin
    Li, Guangliang
    Chen, Guanzhong
    He, Bo
    APPLIED OCEAN RESEARCH, 2022, 118
  • [6] Impairment of arbitration between model-based and model-free reinforcement learning in obsessive-compulsive disorder
    Ruan, Zhongqiang
    Seger, Carol A.
    Yang, Qiong
    Kim, Dongjae
    Lee, Sang Wan
    Chen, Qi
    Peng, Ziwen
    FRONTIERS IN PSYCHIATRY, 2023, 14
  • [7] Model-based learning retrospectively updates model-free values
    Doody, Max
    Van Swieten, Maaike M. H.
    Manohar, Sanjay G.
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [8] The ubiquity of model-based reinforcement learning
    Doll, Bradley B.
    Simon, Dylan A.
    Daw, Nathaniel D.
    CURRENT OPINION IN NEUROBIOLOGY, 2012, 22 (06) : 1075 - 1081
  • [9] The involvement of model-based but not model-free learning signals during observational reward learning in the absence of choice
    Dunne, Simon
    D'Souza, Arun
    O'Doherty, John P.
    JOURNAL OF NEUROPHYSIOLOGY, 2016, 115 (06) : 3195 - 3203
  • [10] Ventral Striatum and Orbitofrontal Cortex Are Both Required for Model-Based, But Not Model-Free, Reinforcement Learning
    McDannald, Michael A.
    Lucantonio, Federica
    Burke, Kathryn A.
    Niv, Yael
    Schoenbaum, Geoffrey
    JOURNAL OF NEUROSCIENCE, 2011, 31 (07) : 2700 - 2705