Actor-Critic Reinforcement Learning Algorithms for Mean Field Games in Continuous Time, State and Action Spaces

Cited by: 0

Authors:

Liang, Hong [1,2]

Chen, Zhiping [1,2]

Jing, Kaili [3]

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Shaanxi, Peoples R China

[2] Xian Int Acad Math & Math Technol, Ctr Optimizat Tech & Quantitat Finance, Xian 710049, Shaanxi, Peoples R China

[3] Univ Ottawa, Dept Math & Stat, Ottawa, ON K1N 6N5, Canada

Source:

APPLIED MATHEMATICS AND OPTIMIZATION | 2024 / Volume 89 / Issue 03

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China;

Keywords:

Mean field games; Entropy regularization; Martingale; Actor-critic algorithms; Linear-quadratic games; SYSTEMS; MODEL;

DOI:

10.1007/s00245-024-10138-1

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Discipline classification code:

070104;

Abstract:

This paper investigates mean field games in continuous time, state and action spaces with an infinite number of agents, where each agent aims to maximize its expected cumulative reward. Using the technique of randomized policies and focusing on a representative agent, we show that policy evaluation and policy gradient are equivalent to the martingale conditions of a process. Combining these results with fictitious play, we propose online and offline actor-critic algorithms for solving continuous mean field games that update the value function and the policy alternately under the given population state and action distributions. We demonstrate through two numerical experiments that the proposed algorithms can converge to the mean field equilibrium quickly and stably.


Pages: 35

