A Gradient-based reinforcement learning model of market equilibration

Times cited: 0

Authors:

He, Zhongzhi [1]

Affiliations:

[1] Brock Univ, Goodman Sch Business, St Catharines, ON L2S 3A1, Canada

Source:

JOURNAL OF ECONOMIC DYNAMICS & CONTROL | 2023 / Volume 152

Keywords:

Reinforcement learning; Machine learning; Stochastic gradient method; Model free simulation; Call market; Market equilibration; Exploitation and exploration; RATIONALITY; DESIGN;

DOI:

10.1016/j.jedc.2023.104670

Chinese Library Classification number:

F [Economics];

Subject classification code:

02;

Abstract:

This paper formulates a game-theoretic reinforcement learning model based on the stochastic gradient method, whereby players start from their initial circumstances with dispersed information, use the expected gradient to update choice propensities, and converge to the predicted equilibrium of belief-based models. Gradient-based reinforcement learning (G-RL) entails a model-free simulation method to estimate the gradient of expected payoff with respect to choice propensities in repeated games. As the gradient points in the steepest direction toward the steady-state equilibrium, G-RL provides a theoretical justification for a probability-weighted, time-varying updating rule that optimally balances the trade-off between reinforcing past successful strategies ('exploitation') and exploring other strategies ('exploration') in choosing actions. The effectiveness and stability of G-RL are demonstrated in a simulated call market, where both the actual effect and the foregone effect are simultaneously updated during market equilibration. In contrast, the failure of payoff-based reinforcement learning (P-RL) is due to its constant-sensitivity updating rule, which causes an imbalance between exploitation and exploration in complex environments. (c) 2023 Elsevier B.V. All rights reserved.
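The contrast the abstract draws between a probability-weighted update and a constant-sensitivity update can be illustrated with a minimal sketch. This is not the paper's updating rule, which the abstract does not spell out. It assumes a softmax choice rule and the standard score-function gradient estimator, and the names `grl_update`, `prl_update`, `simulate`, and the `step` parameter are hypothetical. In the gradient step the chosen action is adjusted by `payoff * (1 - p)` and every unchosen action by `-payoff * p`, so both move at once with weights that change as the probabilities change. In the payoff-based step only the chosen action moves, by a fixed multiple of the payoff.

```python
import math
import random


def softmax(propensities):
    """Map choice propensities to choice probabilities (assumed choice rule)."""
    m = max(propensities)
    exps = [math.exp(q - m) for q in propensities]
    total = sum(exps)
    return [e / total for e in exps]


def grl_update(propensities, action, payoff, step=0.1):
    """One score-function gradient step on expected payoff (illustrative).

    For a softmax rule, d log p[action] / d q[k] = 1{k == action} - p[k],
    so chosen and unchosen actions are updated together with
    probability-dependent weights.
    """
    probs = softmax(propensities)
    return [
        q + step * payoff * ((1.0 if k == action else 0.0) - probs[k])
        for k, q in enumerate(propensities)
    ]


def prl_update(propensities, action, payoff, step=0.1):
    """Payoff-based step with constant sensitivity: only the chosen action moves."""
    return [
        q + step * payoff if k == action else q
        for k, q in enumerate(propensities)
    ]


def simulate(update, mean_payoffs, rounds=5000, seed=0):
    """Repeatedly sample an action, observe a noisy payoff, and update."""
    rng = random.Random(seed)
    q = [0.0] * len(mean_payoffs)
    for _ in range(rounds):
        probs = softmax(q)
        action = rng.choices(range(len(q)), weights=probs)[0]
        payoff = mean_payoffs[action] + rng.gauss(0.0, 0.1)
        q = update(q, action, payoff)
    return softmax(q)


if __name__ == "__main__":
    print("gradient-based:", simulate(grl_update, [0.2, 1.0]))
    print("payoff-based:  ", simulate(prl_update, [0.2, 1.0]))
```

The single-agent setting with fixed mean payoffs is a deliberate simplification. The call market in the paper involves many interacting players, which this sketch does not attempt to reproduce.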

Pages: 21

50 items in total

[1] Direct gradient-based reinforcement learning
Baxter, J
Bartlett, PL
ISCAS 2000: IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS - PROCEEDINGS, VOL III: EMERGING TECHNOLOGIES FOR THE 21ST CENTURY, 2000, : 271 - 274
[2] Direct gradient-based reinforcement learning for robot behavior learning
El-Fakdi, Andres
Carreras, Marc
Ridao, Pere
INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS II, 2007, : 175 - +
[3] Estimation and approximation bounds for gradient-based reinforcement learning
Bartlett, PL
Baxter, J
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2002, 64 (01) : 133 - 150
[4] Inverse Reinforcement Learning from a Gradient-based Learner
Ramponi, Giorgia
Drappo, Gianluca
Restelli, Marcello
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[5] A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents
Zhang, Zhen
Wang, Dongqing
Zhao, Dongbin
Han, Qiaoni
Song, Tingting
IEEE ACCESS, 2018, 6 : 70223 - 70235
[6] Traffic Light Control with Policy Gradient-Based Reinforcement Learning
Tas, Mehmet Bilge Han
Ozkan, Kemal
Saricicek, Inci
Yazici, Ahmet
32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
[7] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
Mazumdar, Eric
Ratliff, Lillian J.
Fiez, Tanner
Sastry, S. Shankar
2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
[8] Gradient-Based Minimization for Multi-Expert Inverse Reinforcement Learning
Tateo, Davide
Pirotta, Matteo
Restelli, Marcello
Bonarini, Andrea
2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 815 - 822
[9] Optimizing thermodynamic trajectories using evolutionary and gradient-based reinforcement learning
Beeler, Chris
Yahorau, Uladzimir
Coles, Rory
Mills, Kyle
Whitelam, Stephen
Tamblyn, Isaac
PHYSICAL REVIEW E, 2021, 104 (06)
[10] Reinforcement learning for enhanced online gradient-based parameter adaptation in metaheuristics
Tatsis, Vasileios A.
Parsopoulos, Konstantinos E.
SWARM AND EVOLUTIONARY COMPUTATION, 2023, 83