Reinforcement Learning for Non-stationary Discrete-Time Linear-Quadratic Mean-Field Games in Multiple Populations

Cited by: 9
Authors
Zaman, Muhammad Aneeq Uz [1]
Miehling, Erik [1]
Basar, Tamer [1]
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Keywords
Mean-field games; Large population games on networks; Multi-agent reinforcement learning; Zero-order stochastic optimization; SYSTEMS
DOI
10.1007/s13235-022-00448-w
Chinese Library Classification (CLC) number
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this paper, we approach multi-agent reinforcement learning from a mean-field game perspective, where the number of agents tends to infinity. Our analysis focuses on the structured setting of systems with linear dynamics and quadratic costs, named linear-quadratic mean-field games, evolving over a discrete-time infinite horizon where agents are assumed to be partitioned into finitely many populations connected by a network of known structure. The functional forms of the agents' costs and dynamics are assumed to be the same within populations, but differ between populations. We first characterize the equilibrium of the mean-field game, which further prescribes an epsilon-Nash equilibrium for the finite population game. Our main focus is on the design of a learning algorithm, based on zero-order stochastic optimization, for computing mean-field equilibria. The algorithm exploits the affine structure of both the equilibrium controller and equilibrium mean-field trajectory by decomposing the learning task into first learning the linear terms and then learning the affine terms. We present a convergence proof and a finite-sample bound quantifying the estimation error as a function of the number of samples.
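As a rough illustration of the zero-order stochastic optimization idea named in the abstract, the sketch below tunes a linear feedback gain for a small linear-quadratic problem using only cost evaluations. It is not the authors' algorithm: the single-agent setting, the matrices `A`, `B`, `Q`, `R`, the finite horizon, the two-point estimator, and all step sizes are assumptions chosen for illustration, and the mean-field coupling and affine terms are omitted.

```python
import numpy as np

# Hypothetical 2-state, 1-input system; every number here is an illustrative assumption.
A = np.array([[0.8, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)   # state cost weight
R = np.eye(1)   # control cost weight
HORIZON = 30    # finite truncation of the infinite-horizon cost


def rollout_cost(K):
    """Quadratic cost of the policy u_t = -K x_t, summed over two fixed initial states."""
    total = 0.0
    for x0 in np.eye(2):
        x = x0.copy()
        for _ in range(HORIZON):
            u = -K @ x
            total += x @ Q @ x + u @ R @ u
            x = A @ x + B @ u
    return float(total)


def zero_order_gradient(K, radius, rng):
    """Two-point gradient estimate built from cost evaluations only."""
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)  # random direction on the unit sphere
    diff = rollout_cost(K + radius * U) - rollout_cost(K - radius * U)
    return (K.size / (2.0 * radius)) * diff * U


def train(iterations=500, step=1e-3, radius=0.05, seed=0):
    """Plain gradient descent on the gain, driven by the zero-order estimate."""
    rng = np.random.default_rng(seed)
    K = np.zeros((1, 2))  # K = 0 is stabilizing here because A is stable
    for _ in range(iterations):
        K = K - step * zero_order_gradient(K, radius, rng)
    return K


K_learned = train()
initial_cost = rollout_cost(np.zeros((1, 2)))
final_cost = rollout_cost(K_learned)
print(initial_cost, final_cost)
```

The small step size is deliberate: the estimate is scaled by the parameter dimension, so a conservative step keeps each update a descent step along the sampled direction.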
Pages: 118-164
Number of pages: 47
Related papers
50 in total
  • [1] Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games
    Zaman, Muhammad Aneeq Uz
    Zhang, Kaiqing
    Miehling, Erik
    Basar, Tamer
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020: 2278 - 2284
  • [2] Reinforcement Learning for Non-stationary Discrete-Time Linear–Quadratic Mean-Field Games in Multiple Populations
    Muhammad Aneeq uz Zaman
    Erik Miehling
    Tamer Başar
    Dynamic Games and Applications, 2023, 13 : 118 - 164
  • [3] Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games
    Zaman, Muhammad Aneeq Uz
    Zhang, Kaiqing
    Miehling, Erik
    Basar, Tamer
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020: 333 - 339
  • [4] Linear-Quadratic Optimal Control for Discrete-Time Mean-Field Systems With Input Delay
    Qi, Qingyuan
    Xie, Lihua
    Zhang, Huanshui
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (08) : 3806 - 3821
  • [5] Linear-Quadratic Near-Optimal Controls For Discrete-Time Mean-Field Systems
    Qin, Xilin
    Hou, Ting
    2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024: 945 - 950
  • [6] Discrete-Time Linear-Quadratic Dynamic Games
    Pachter, M.
    Pham, K. D.
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2010, 146 (01) : 151 - 179
  • [7] Discrete-Time Linear-Quadratic Dynamic Games
    M. Pachter
    K. D. Pham
    Journal of Optimization Theory and Applications, 2010, 146 : 151 - 179
  • [8] Mean-field linear-quadratic stochastic differential games
    Sun, Jingrui
    Wang, Hanxiao
    Wu, Zhen
    JOURNAL OF DIFFERENTIAL EQUATIONS, 2021, 296 : 299 - 334
  • [9] A discrete-time mean-field stochastic linear-quadratic optimal control problem with financial application
    Li, Xun
    Tai, Allen H.
    Tian, Fei
    INTERNATIONAL JOURNAL OF CONTROL, 2021, 94 (01) : 175 - 189
  • [10] Discrete-time mean-field stochastic linear-quadratic optimal control problem with finite horizon
    Song, Teng
    Liu, Bin
    ASIAN JOURNAL OF CONTROL, 2021, 23 (02) : 979 - 989