Introspection dynamics: a simple model of counterfactual learning in asymmetric games

Cited by: 15
Authors
Couto, M. C. [1]
Giaimo, S. [2]
Hilbe, C. [1]
Affiliations
[1] Max Planck Inst Evolutionary Biol, Max Planck Res Grp Dynam Social Behav, D-24306 Plon, Germany
[2] Max Planck Inst Evolutionary Biol, Dept Evolutionary Theory, D-24306 Plon, Germany
Source
NEW JOURNAL OF PHYSICS | 2022 / Volume 24 / Issue 06
Funding
European Research Council;
Keywords
evolutionary game theory; counterfactual learning; myopic updating; asymmetric games; social dilemmas; volunteer's dilemma; PRISONERS-DILEMMA GAME; EVOLUTIONARY DYNAMICS; STATISTICAL-MECHANICS; IMITATION PROCESSES; BIMATRIX GAMES; STRATEGIES; COOPERATION; MUTATIONS; SELECTION; FIXATION;
DOI
10.1088/1367-2630/ac6f76
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
Social behavior in human and animal populations can be studied as an evolutionary process. Individuals often make decisions between different strategies, and those strategies that yield a fitness advantage tend to spread. Traditionally, much work in evolutionary game theory considers symmetric games: individuals are assumed to have access to the same set of strategies, and they experience the same payoff consequences. As a result, they can learn more profitable strategies by imitation. However, interactions are often asymmetric. In that case, imitation may be infeasible (because individuals differ in the strategies they are able to use), or it may be undesirable (because individuals differ in their incentives to use a strategy). Here, we consider an alternative learning process that applies to arbitrary asymmetric games: introspection dynamics. Under this process, individuals regularly compare their present strategy to a randomly chosen alternative strategy. If the alternative strategy yields a payoff advantage, it is more likely to be adopted. In this work, we formalize introspection dynamics for pairwise games. We derive simple and explicit formulas for the abundance of each strategy over time and apply these results to several well-known social dilemmas. In particular, for the volunteer's timing dilemma, we show that the player with the lowest cooperation cost learns to cooperate without delay.
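The update rule described in the abstract can be illustrated with a minimal simulation sketch. The abstract does not state the adoption rule, so the logistic (Fermi-type) function, the parameter `beta`, the function names, and the example payoff values below are all assumptions chosen for illustration, not the authors' implementation.

```python
import math
import random


def adoption_probability(payoff_gain, beta=1.0):
    # ASSUMPTION: logistic (Fermi-type) adoption rule with strength `beta`.
    # The abstract only says a payoff advantage makes adoption more likely.
    return 1.0 / (1.0 + math.exp(-beta * payoff_gain))


def introspection_step(state, payoffs, n_strategies, beta, rng):
    """One update: a random player compares its strategy to a random alternative.

    state        -- [s1, s2], current strategy index of each player
    payoffs      -- payoffs[p][s1][s2] is the payoff to player p
    n_strategies -- number of strategies available to each player
    """
    player = rng.randrange(2)
    current = state[player]
    alternatives = [s for s in range(n_strategies[player]) if s != current]
    trial = list(state)
    trial[player] = rng.choice(alternatives)
    # Payoff the player would get with the alternative, co-player held fixed.
    gain = (payoffs[player][trial[0]][trial[1]]
            - payoffs[player][state[0]][state[1]])
    if rng.random() < adoption_probability(gain, beta):
        return trial
    return list(state)


def simulate(payoffs, n_strategies, beta=1.0, steps=50000, seed=0):
    """Return the fraction of time each player spends on each strategy."""
    rng = random.Random(seed)
    state = [rng.randrange(n) for n in n_strategies]
    counts = [[0] * n for n in n_strategies]
    for _ in range(steps):
        state = introspection_step(state, payoffs, n_strategies, beta, rng)
        for p, s in enumerate(state):
            counts[p][s] += 1
    return [[c / steps for c in row] for row in counts]


# Hypothetical asymmetric donation game: strategy 0 = cooperate, 1 = defect.
# Benefit b = 3 for both players; cooperation costs c1 = 1 and c2 = 2.
PAYOFFS = [
    [[2, -1], [3, 0]],   # payoffs to player 1, indexed [s1][s2]
    [[1, 3], [-2, 0]],   # payoffs to player 2, indexed [s1][s2]
]

if __name__ == "__main__":
    freqs = simulate(PAYOFFS, n_strategies=[2, 2])
    print("player 1 cooperation frequency:", freqs[0][0])
    print("player 2 cooperation frequency:", freqs[1][0])
```

In this illustrative game, each player's gain from switching does not depend on the co-player, so under the assumed logistic rule the player with the lower cooperation cost should cooperate more often in the long run. Unlike imitation, the update never requires a player to copy the co-player's strategy, which is why it remains applicable when strategy sets or payoffs differ.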
Pages: 23