Learning in Games via Reinforcement and Regularization

Cited by: 83
Authors
Mertikopoulos, Panayotis [1,2]
Sandholm, William H. [3 ]
Affiliations
[1] CNRS, French Natl Ctr Sci Res, LIG, F-38000 Grenoble, France
[2] Univ Grenoble Alpes, LIG, F-38000 Grenoble, France
[3] Univ Wisconsin, Dept Econ, Madison, WI 53706 USA
Funding
U.S. National Science Foundation
Keywords
Bregman divergence; dominated strategies; equilibrium stability; Fenchel coupling; penalty functions; projection dynamics; regularization; reinforcement learning; replicator dynamics; time averages; DYNAMICAL-SYSTEMS; CONVERGENCE; REPLICATOR; STABILITY; GEOMETRY;
DOI
10.1287/moor.2016.0778
Chinese Library Classification (CLC)
C93 [Management]; O22 [Operations Research]
Subject Classification Codes
070105; 12; 1201; 1202; 120202
Abstract
We investigate a class of reinforcement learning dynamics where players adjust their strategies based on their actions' cumulative payoffs over time; specifically, they play mixed strategies that maximize their expected cumulative payoff minus a regularization term. A widely studied example is exponential reinforcement learning, a process induced by an entropic regularization term that leads mixed strategies to evolve according to the replicator dynamics. However, in contrast to the class of regularization functions used to define smooth best responses in models of stochastic fictitious play, the functions used in this paper need not be infinitely steep at the boundary of the simplex; in fact, dropping this requirement gives rise to an important dichotomy between steep and nonsteep cases. In this general framework, we extend several properties of exponential learning, including the elimination of dominated strategies, the asymptotic stability of strict Nash equilibria, and the convergence of time-averaged trajectories in zero-sum games with an interior Nash equilibrium.
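
To make the regularized choice rule concrete, the sketch below simulates the entropic (exponential-weights) special case in a standard zero-sum game: each player tracks the cumulative payoff of its pure strategies and plays the softmax of those scores, i.e., the mixed strategy that maximizes expected cumulative payoff minus a scaled negative-entropy penalty. The game (Matching Pennies), the step size eta, and all function and variable names are illustrative assumptions, not taken from the paper.

    import numpy as np

    # Row player's payoff matrix in Matching Pennies; the column player receives -A,
    # so the game is zero-sum with a unique interior Nash equilibrium at (1/2, 1/2).
    A = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])

    def logit_choice(scores, eta):
        """Softmax of cumulative scores: the maximizer over the simplex of
        <scores, x> minus (1/eta) times the negative entropy of x."""
        z = np.exp(eta * (scores - scores.max()))  # shift by max for numerical stability
        return z / z.sum()

    T, eta = 20000, 0.05          # horizon and step size (illustrative values)
    score_x = np.zeros(2)         # cumulative payoffs of the row player's pure strategies
    score_y = np.zeros(2)         # cumulative payoffs of the column player's pure strategies
    avg_x, avg_y = np.zeros(2), np.zeros(2)

    for t in range(1, T + 1):
        x = logit_choice(score_x, eta)
        y = logit_choice(score_y, eta)
        score_x += A @ y          # payoff of each row strategy against the column mix
        score_y += -A.T @ x       # payoff of each column strategy against the row mix
        avg_x += (x - avg_x) / t  # running time averages of the mixed strategies
        avg_y += (y - avg_y) / t

    print("time-averaged strategies:", avg_x, avg_y)  # both approach (0.5, 0.5)

In this game the per-round strategies cycle, but the running time averages approach the interior equilibrium (1/2, 1/2), in line with the time-average convergence result for zero-sum games stated in the abstract.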
Pages: 1297-1324
Number of pages: 28
Related Papers (50 in total)
  • [1] Reinforcement learning in population games
    Lahkar, Ratul
    Seymour, Robert M.
    GAMES AND ECONOMIC BEHAVIOR, 2013, 80 : 10 - 38
  • [2] Drafting in Collectible Card Games via Reinforcement Learning
    Vieira, Ronaldo
    Tavares, Anderson Rocha
    Chaimowicz, Luiz
    2020 19TH BRAZILIAN SYMPOSIUM ON COMPUTER GAMES AND DIGITAL ENTERTAINMENT (SBGAMES 2020), 2020, : 54 - 61
  • [3] Offline Reinforcement Learning With Behavior Value Regularization
    Huang, Longyang
    Dong, Botao
    Xie, Wei
    Zhang, Weidong
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (06) : 3692 - 3704
  • [4] Social aspiration reinforcement learning in Cournot games
    Fatas, Enrique
    Morales, Antonio J.
    Jaramillo-Gutierrez, Ainhoa
    ECONOMIC THEORY, 2024,
  • [5] Rethinking Discount Regularization: New Interpretations, Unintended Consequences, and Solutions for Regularization in Reinforcement Learning
    Rathnam, Sarah
    Parbhoo, Sonali
    Swaroop, Siddharth
    Pan, Weiwei
    Murphy, Susan A.
    Doshi-Velez, Finale
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 48
  • [6] Playing Games with Reinforcement Learning via Perceiving Orientation and Exploring Diversity
    Zhang, Dong
    Yang, Le
    Shi, Haobin
    Mou, Fangqing
    Hu, Mengkai
    PROCEEDINGS OF 2017 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC 2017), 2017, : 30 - 34
  • [7] Reinforcement learning applied to games
    Crespo, Joao
    Wichert, Andreas
    SN APPLIED SCIENCES, 2020, 2 (05):
  • [8] On the robustness of learning in games with stochastically perturbed payoff observations
    Bravo, Mario
    Mertikopoulos, Panayotis
    GAMES AND ECONOMIC BEHAVIOR, 2017, 103 : 41 - 66
  • [9] On Passivity, Reinforcement Learning, and Higher Order Learning in Multiagent Finite Games
    Gao, Bolin
    Pavel, Lacra
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (01) : 121 - 136