A Gradient-based reinforcement learning model of market equilibration

Author
He, Zhongzhi [1]
Affiliation
[1] Brock Univ, Goodman Sch Business, St Catharines, ON L2S 3A1, Canada
Source
JOURNAL OF ECONOMIC DYNAMICS & CONTROL | 2023 / Vol. 152
Keywords
Reinforcement learning; Machine learning; Stochastic gradient method; Model free simulation; Call market; Market equilibration; Exploitation and exploration; RATIONALITY; DESIGN
DOI
10.1016/j.jedc.2023.104670
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
This paper formulates a game-theoretic reinforcement learning model based on the stochastic gradient method whereby players start from their initial circumstances with dispersed information, use the expected gradient to update choice propensities, and converge to the predicted equilibrium of belief-based models. Gradient-based reinforcement learning (G-RL) entails a model-free simulation method to estimate the gradient of expected payoff with respect to choice propensities in repeated games. As the gradient points to the steepest direction towards discovering steady-state equilibrium, G-RL provides a theoretical justification for a probability-weighted time-varying updating rule that optimally balances the trade-off between reinforcing past successful strategies ('exploitation') and exploring other strategies ('exploration') in choosing actions. The effectiveness and stability of G-RL are demonstrated in a simulated call market, where both the actual effect and the foregone effect are simultaneously updated during market equilibration. In contrast, the failure of payoff-based reinforcement learning (P-RL) is due to its constant-sensitivity updating rule, which causes an imbalance between exploitation and exploration in complex environments. (c) 2023 Elsevier B.V. All rights reserved.
Pages: 21
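
Illustrative sketch (not from the paper). The abstract describes updating choice propensities along the gradient of expected payoff, with the gradient estimated by model-free simulation. The record does not give the paper's actual updating rule, so the code below is only a generic softmax policy-gradient illustration of that idea. Everything in it is an assumption made for illustration: the softmax choice rule, the fixed payoff vector standing in for a call market, the step size `lr`, and the function names `softmax`, `expected_gradient`, `simulated_gradient`, and `run_gradient_learning`. Under a softmax rule the gradient for action j is p_j * (payoff_j - expected payoff), so each update is weighted by the current choice probability and touches both the chosen and the foregone actions.

```python
import math
import random


def softmax(q):
    """Map propensities to choice probabilities (assumed choice rule)."""
    m = max(q)
    exps = [math.exp(x - m) for x in q]
    total = sum(exps)
    return [e / total for e in exps]


def expected_gradient(q, payoffs):
    """Closed-form gradient of expected payoff w.r.t. propensities.

    For a softmax rule: d E[payoff] / d q_j = p_j * (payoff_j - E[payoff]).
    """
    p = softmax(q)
    mean = sum(pj * u for pj, u in zip(p, payoffs))
    return [pj * (u - mean) for pj, u in zip(p, payoffs)]


def simulated_gradient(q, payoffs, n_samples=1000, seed=0):
    """Model-free estimate of the same gradient from sampled actions.

    Uses the score-function identity: average of
    payoff[a] * (1[a == j] - p_j) over sampled actions a.
    """
    rng = random.Random(seed)
    p = softmax(q)
    k = len(q)
    grad = [0.0] * k
    for _ in range(n_samples):
        a = rng.choices(range(k), weights=p)[0]
        for j in range(k):
            indicator = 1.0 if j == a else 0.0
            grad[j] += payoffs[a] * (indicator - p[j]) / n_samples
    return grad


def run_gradient_learning(payoffs, steps=2000, lr=0.5):
    """Repeated gradient-ascent updates of the propensities."""
    q = [0.0] * len(payoffs)
    for _ in range(steps):
        g = expected_gradient(q, payoffs)
        q = [qi + lr * gi for qi, gi in zip(q, g)]
    return softmax(q)


if __name__ == "__main__":
    payoffs = [1.0, 2.0, 3.0]  # hypothetical payoffs, one per action
    print(run_gradient_learning(payoffs))
```

In this toy setting the step taken for each action shrinks as its choice probability shrinks, which is one way to read the "probability-weighted time-varying" rule mentioned in the abstract. A constant-sensitivity rule would instead add the realized payoff to the chosen action's propensity regardless of the current probabilities.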