Statistical Mechanics of Reward-Modulated Learning in Decision-Making Networks

Cited by: 3
Authors
Katahira, Kentaro [1, 2, 3]
Okanoya, Kazuo [1, 3, 4]
Okada, Masato [1, 2, 3]
Affiliations
[1] Japan Sci & Technol Agcy, ERATO, Okanoya Emot Informat Project, Wako, Saitama 3510198, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Chiba 2775861, Japan
[3] RIKEN Brain Sci Inst, Wako, Saitama 3510198, Japan
[4] Univ Tokyo, Grad Sch Arts & Sci, Tokyo 1538902, Japan
Keywords
DEPENDENT SYNAPTIC PLASTICITY; MATCHING LAW; CORTICAL CIRCUITS; REINFORCEMENT; BEHAVIOR; MODEL; CORTEX; CHOICE; CELLS;
DOI
10.1162/NECO_a_00264
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The neural substrates of decision making have been intensively studied using experimental and computational approaches. Alternative-choice tasks with reinforcement have often been employed in investigations of decision making. In many experiments, choice behavior has been found empirically to follow Herrnstein's matching law. A number of theoretical studies have sought to explain the mechanisms responsible for matching behavior, and have proved that various learning rules achieve matching behavior as a steady state of the learning process. The models in these studies have had only a few parameters. However, a large number of neurons and synapses are expected to participate in decision making in the brain. We investigated learning behavior in simple but large-scale decision-making networks. We considered the covariance learning rule, which has been demonstrated to achieve matching behavior as a steady state (Loewenstein & Seung, 2006). We analyzed model behavior in a thermodynamic limit where the number of plastic synapses goes to infinity. Using techniques from statistical mechanics, we derived deterministic differential equations for the order parameters in this limit, which allow an exact calculation of the evolution of choice behavior. As a result, we found that matching behavior cannot be a steady state of learning when the fluctuations in input from individual sensory neurons are so large that they affect the net input to value-encoding neurons. This situation naturally arises when the synaptic strength is sufficiently strong and the excitatory and inhibitory inputs to the value-encoding neurons are balanced. The deviation from matching behavior is caused by increasing variance in the input potential due to the diffusion of synaptic efficacies. This effect causes an undermatching phenomenon, which has often been observed in behavioral experiments.
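For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the standard textbook forms of the matching law, its generalized version (in which undermatching is defined), and one common form of a covariance learning rule. The symbols $N_i$, $I_i$, $s$, $b$, $w_j$, $x_j$, $R$, and $\eta$ are notation assumed here for illustration; this record does not give the paper's own notation or the exact variant of the rule it analyzes.

```latex
% Herrnstein's matching law for two alternatives A and B:
% N_i = number of choices of alternative i, I_i = reward income earned from i.
\[
  \frac{N_A}{N_A + N_B} = \frac{I_A}{I_A + I_B}
\]

% Generalized matching law: s is the sensitivity, b the bias.
% Strict matching corresponds to s = 1, b = 1; undermatching to s < 1.
\[
  \log\frac{N_A}{N_B} = s \,\log\frac{I_A}{I_B} + \log b
\]

% One common form of a covariance rule (assumed notation):
% w_j = synaptic efficacy, x_j = presynaptic activity, R = reward,
% \eta = learning rate, \langle\cdot\rangle = expectation.
\[
  \Delta w_j = \eta \,\bigl(R - \langle R \rangle\bigr)\,\bigl(x_j - \langle x_j \rangle\bigr),
  \qquad
  \langle \Delta w_j \rangle = \eta \,\operatorname{Cov}(R, x_j)
\]
```

Under a rule of this kind the average weight change vanishes when reward and activity are uncorrelated, which is the sense in which a steady state of learning can be related to matching behavior.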
Pages: 1230-1270
Number of pages: 41