The "Proactive" Model of Learning: Integrative Framework for Model-Free and Model-Based Reinforcement Learning Utilizing the Associative Learning-Based Proactive Brain Concept

Cited: 18
Authors
Zsuga, Judit [1 ]
Biro, Klara [1 ]
Papp, Csaba [1 ]
Tajti, Gabor [1 ]
Gesztelyi, Rudolf [2 ]
Affiliations
[1] Univ Debrecen, Fac Publ Hlth, Dept Hlth Syst Management & Qual Management Hlth, Nagyerdei Krt 98, H-4032 Debrecen, Hungary
[2] Univ Debrecen, Fac Pharm, Dept Pharmacol, H-4032 Debrecen, Hungary
Keywords
model-free reinforcement learning; model-based reinforcement learning; reinforcement learning agent; proactive brain; default network; GOAL-DIRECTED BEHAVIORS; ORBITOFRONTAL CORTEX; DOPAMINE NEURONS; PREDICTION ERROR; PREFRONTAL CORTEX; VENTRAL STRIATUM; BASOLATERAL AMYGDALA; INCENTIVE SALIENCE; NUCLEUS-ACCUMBENS; REPRESENT REWARD;
DOI
10.1037/bne0000116
Chinese Library Classification (CLC)
B84 [Psychology]; C [Social sciences, general]; Q98 [Anthropology]
Subject Classification Codes
03 ; 0303 ; 030303 ; 04 ; 0402 ;
Abstract
Reinforcement learning (RL) is a powerful concept underlying forms of associative learning governed by a scalar reward signal, with learning taking place when expectations are violated. RL may be assessed using model-based and model-free approaches. Model-based reinforcement learning involves the amygdala, the hippocampus, and the orbitofrontal cortex (OFC); the model-free system involves the pedunculopontine tegmental nucleus (PPTgN), the ventral tegmental area (VTA), and the ventral striatum (VS). Based on the functional connectivity of the VS, the model-free and model-based RL systems converge on the VS, which computes value by integrating model-free signals (received as reward prediction errors) with model-based, reward-related input. Using the concept of the reinforcement learning agent, we propose that the VS serves as the value-function component of the RL agent. For the model used in model-based computations, we turn to the proactive brain concept, which assigns a ubiquitous function to the default network based on its extensive functional overlap with contextual associative areas. By means of the default network, the brain continuously organizes its environment into context frames, enabling the formulation of analogy-based associations that are turned into predictions of what to expect. The OFC integrates reward-related information into context frames when computing reward expectations, compiling the stimulus-reward and context-reward information supplied by the amygdala and the hippocampus, respectively. Furthermore, we suggest that the integration of model-based reward expectations into the value signal is further supported by efferents of the OFC that reach structures canonical for model-free learning (e.g., the PPTgN, VTA, and VS).
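To make the proposed division of labor concrete: below is a minimal, illustrative Python sketch (not from the paper; the state names, parameter values, and the context-model representation are all assumptions) contrasting the two routes to value that the abstract assigns to the VS. The model-free route caches value via a temporal-difference update driven by a reward prediction error (the signal ascribed to the VTA-VS pathway), while the model-based route reads an expectation out of a learned context model (the role ascribed to the OFC compiling amygdala and hippocampus input).

import random

ALPHA = 0.1      # learning rate (assumed)
GAMMA = 0.9      # temporal discount factor (assumed)
P_REWARD = 0.8   # probability that the cue is followed by reward (assumed)

def model_free_update(value, state, next_state, reward):
    # TD(0): learning takes place only when expectations are violated,
    # i.e., when the reward prediction error (delta) is nonzero.
    delta = reward + GAMMA * value[next_state] - value[state]
    value[state] += ALPHA * delta
    return delta

def model_based_value(context_model, state):
    # Expectation read out from a learned context model: the analogue of
    # compiling stimulus-reward and context-reward associations.
    return sum(p * r for p, r in context_model.get(state, []))

random.seed(0)
value = {"cue": 0.0, "outcome": 0.0}        # cached, model-free values
context_model = {"cue": [(P_REWARD, 1.0)]}  # learned model: P(reward | cue)

for trial in range(500):
    reward = 1.0 if random.random() < P_REWARD else 0.0
    model_free_update(value, "cue", "outcome", reward)

print("model-free (cached) value of cue:", round(value["cue"], 2))
print("model-based expectation for cue:", model_based_value(context_model, "cue"))

Both routes converge on roughly the same quantity here (about 0.8), which is the sense in which a single value signal in the VS could integrate a prediction-error-driven cached value with a model-derived expectation.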
Pages: 6-18
Page count: 13