Learning in Games via Reinforcement and Regularization

Cited by: 83
Authors
Mertikopoulos, Panayotis [1, 2]
Sandholm, William H. [3]
Affiliations
[1] CNRS, French Natl Ctr Sci Res, LIG, F-38000 Grenoble, France
[2] Univ Grenoble Alpes, LIG, F-38000 Grenoble, France
[3] Univ Wisconsin, Dept Econ, Madison, WI 53706 USA
Funding
National Science Foundation (USA)
Keywords
Bregman divergence; dominated strategies; equilibrium stability; Fenchel coupling; penalty functions; projection dynamics; regularization; reinforcement learning; replicator dynamics; time averages; DYNAMICAL-SYSTEMS; CONVERGENCE; REPLICATOR; STABILITY; GEOMETRY;
DOI
10.1287/moor.2016.0778
Chinese Library Classification
C93 [Management science]; O22 [Operations research];
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
We investigate a class of reinforcement learning dynamics where players adjust their strategies based on their actions' cumulative payoffs over time, specifically by playing mixed strategies that maximize their expected cumulative payoff minus a regularization term. A widely studied example is exponential reinforcement learning, a process induced by an entropic regularization term which leads mixed strategies to evolve according to the replicator dynamics. However, in contrast to the class of regularization functions used to define smooth best responses in models of stochastic fictitious play, the functions used in this paper need not be infinitely steep at the boundary of the simplex; in fact, dropping this requirement gives rise to an important dichotomy between steep and nonsteep cases. In this general framework, we extend several properties of exponential learning, including the elimination of dominated strategies, the asymptotic stability of strict Nash equilibria, and the convergence of time-averaged trajectories in zero-sum games with an interior Nash equilibrium.
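The learning scheme summarized in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: the game (a Prisoner's Dilemma payoff matrix), the Euler step size, the horizon, and all function names are assumptions chosen for illustration. Each player accumulates a score vector of cumulative payoffs and plays the mixed strategy that maximizes expected score minus a regularizer. With the entropic regularizer this choice map is the softmax (a steep case); with the quadratic regularizer it is the Euclidean projection onto the simplex (a nonsteep case).

```python
import numpy as np

def logit_choice(y):
    """Choice map for h(x) = sum_i x_i log x_i (entropic, steep): softmax of the scores."""
    z = np.exp(y - np.max(y))  # subtract the max for numerical stability
    return z / z.sum()

def projection_choice(y):
    """Choice map for h(x) = 0.5 * ||x||^2 (quadratic, nonsteep):
    Euclidean projection of the score vector onto the simplex."""
    u = np.sort(y)[::-1]                       # scores in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(y) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index kept in the support
    tau = css[rho] / (rho + 1)                 # threshold subtracted from every score
    return np.maximum(y - tau, 0.0)

def simulate(choice, A, B, steps=2000, dt=0.01):
    """Euler discretization (an assumption of this sketch) of the score dynamics:
    scores integrate each action's payoff against the opponent's current mixed strategy."""
    y1 = np.zeros(A.shape[0])
    y2 = np.zeros(A.shape[1])
    for _ in range(steps):
        x1, x2 = choice(y1), choice(y2)
        y1 += dt * (A @ x2)     # payoff vector of player 1
        y2 += dt * (B.T @ x1)   # payoff vector of player 2
    return choice(y1), choice(y2)

# Hypothetical test game: Prisoner's Dilemma, actions ordered (Cooperate, Defect).
# A[i, j] is player 1's payoff and B[i, j] is player 2's payoff at profile (i, j).
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
B = A.T

x_logit, _ = simulate(logit_choice, A, B)
x_proj, _ = simulate(projection_choice, A, B)
print("weight on the dominated action, entropic: ", x_logit[0])
print("weight on the dominated action, quadratic:", x_proj[0])
```

In this toy run the dominated action (Cooperate) is eliminated under both regularizers, but in different ways: the softmax weight decays toward zero while staying strictly positive, whereas the projected weight is exactly zero once the score gap exceeds one. That difference is one concrete face of the steep versus nonsteep dichotomy mentioned in the abstract.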
Pages: 1297-1324
Page count: 28