Exploitation by asymmetry of information reference in coevolutionary learning in prisoner's dilemma game

Cited by: 9
Authors
Fujimoto, Yuma [1]
Kaneko, Kunihiko [2]
Affiliations
[1] SOKENDAI, Dept Evolutionary Studies Biosyst, Sch Adv Sci, Hayama, Kanagawa, Japan
[2] Univ Tokyo, Dept Basic Sci, Tokyo, Japan
Source
JOURNAL OF PHYSICS-COMPLEXITY | 2021 / Vol. 2 / Issue 04
Keywords
repeated games; prisoner's dilemma; exploitation; learning; TIT-FOR-TAT; MEMORY-ONE STRATEGIES; WIN-STAY; STOCHASTIC STRATEGIES; INTENTION RECOGNITION; LOSE-SHIFT; EVOLUTION; COOPERATION; GENEROSITY; DYNAMICS
DOI
10.1088/2632-072X/ac301a
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Mutual relationships, such as cooperation and exploitation, are the basis of human and other biological societies. These relationships are rooted in the decision making of individuals, and in whether they choose to be selfish or altruistic. How individuals choose their behaviors can be analyzed as a strategy optimization process in the framework of game theory. Previous studies have shown that reference to individuals' previous actions plays an important role in their choice of strategies and in the establishment of social relationships. A fundamental question remains as to whether an individual with more information can exploit another who has less information when both are learning their choice of strategies. Here we demonstrate that a player using a memory-one strategy, who can refer to their own previous action and that of their opponent, can be exploited by a reactive player, who refers only to the other player's previous action, under mutual adaptive learning. This is counterintuitive because the former has a wider choice of strategies and can potentially obtain a higher payoff. We demonstrate this by formulating the learning process of strategy choices that optimize the payoffs in terms of coupled replicator dynamics and applying it to the prisoner's dilemma game. Further, we show that the player using a memory-one strategy, by referring to their own previous experience, can sometimes act more generously toward the opponent's defection, thereby accepting the opponent's exploitation. Overall, we find that through adaptive learning, a player with limited information usually exploits the player with more information, leading to asymmetric exploitation.
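The setting described in the abstract can be illustrated with a minimal sketch: a memory-one player facing a reactive player in the repeated prisoner's dilemma, with long-run payoffs read off the stationary distribution of the resulting four-state Markov chain. This is not the authors' coupled replicator dynamics, only the payoff evaluation that such a learning process would build on. The payoff values (T, R, P, S) = (5, 3, 1, 0), the function names, and the use of power iteration are assumptions made for illustration; none of them comes from the record above.

```python
# Assumed (conventional) prisoner's dilemma payoffs: temptation, reward,
# punishment, sucker. These values are NOT taken from the abstract.
T, R, P, S = 5.0, 3.0, 1.0, 0.0


def transition_matrix(p, q):
    """Build the 4x4 transition matrix over states (a1, a2) = CC, CD, DC, DD.

    p: memory-one strategy of player 1, i.e. P(cooperate | own last action,
       opponent's last action), listed in the order (CC, CD, DC, DD).
    q: reactive strategy of player 2, i.e. P(cooperate | player 1's last
       action), listed in the order (C, D).
    """
    # Player 2 only sees player 1's last action, so states CC and CD share
    # q[0], and states DC and DD share q[1].
    q_by_state = (q[0], q[0], q[1], q[1])
    rows = []
    for p1, p2 in zip(p, q_by_state):
        rows.append([p1 * p2, p1 * (1 - p2), (1 - p1) * p2, (1 - p1) * (1 - p2)])
    return rows


def stationary(matrix, iterations=2000):
    """Approximate the stationary distribution by power iteration.

    Assumes all strategy probabilities lie strictly between 0 and 1, so the
    chain has a unique stationary distribution.
    """
    dist = [0.25] * 4
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return dist


def payoffs(p, q):
    """Return the long-run average payoffs (player 1, player 2)."""
    pi = stationary(transition_matrix(p, q))
    u1 = sum(x * y for x, y in zip(pi, (R, S, T, P)))  # player 1's view of CC, CD, DC, DD
    u2 = sum(x * y for x, y in zip(pi, (R, T, S, P)))  # player 2's view of the same states
    return u1, u2


if __name__ == "__main__":
    # Hypothetical strategies: a fairly generous memory-one player against a
    # reactive player who cooperates less often.
    u1, u2 = payoffs((0.9, 0.6, 0.7, 0.4), (0.6, 0.2))
    print(f"memory-one payoff: {u1:.3f}, reactive payoff: {u2:.3f}")
```

In this sketch, exploitation in the sense of the abstract corresponds to the reactive player's payoff exceeding that of the memory-one player; a learning rule would then adjust `p` and `q` over time according to these payoffs.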
Pages: 15