Emergence of exploitation as symmetry breaking in iterated prisoner's dilemma

Cited by: 12

Authors:

Fujimoto, Yuma [1]

Kaneko, Kunihiko [1, 2]

Affiliations:

[1] Univ Tokyo, Grad Sch Arts & Sci, Dept Basic Sci, Meguro Ku, 3-8-1 Komaba, Tokyo 1538902, Japan

[2] Univ Tokyo, Universal Biol Inst, Res Ctr Complex Syst Biol, 3-8-1 Komaba, Tokyo 1538902, Japan

Source:

PHYSICAL REVIEW RESEARCH | 2019 / Vol. 1 / Issue 03

Keywords:

EVOLUTION; REINFORCEMENT; STRATEGIES; EXTORTION;

DOI:

10.1103/PhysRevResearch.1.033077

Chinese Library Classification number:

O4 [Physics];

Discipline classification code:

0702 ;

Abstract:

In society, mutual cooperation, defection, and exploitative relationships are common. Whereas cooperation and defection are studied extensively in the literature on game theory, exploitative relationships between players, in which one receives a larger benefit than the other while the game itself is symmetric, are little explored. In a recent seminal study, Press and Dyson demonstrated that if only one player can learn about the other, asymmetric exploitation is achieved in the prisoner's dilemma game. In their study, however, asymmetry is assumed in decision making between persons; the exploiting player one-sidedly determines and fixes the strategy and the exploited player follows it. It is unknown whether such exploitation emerges and is stably established even when both players learn about each other symmetrically and try to optimize their payoffs. Here, we first formulate a dynamical system that describes the change in a player's probabilistic strategy with reinforcement learning to obtain greater payoffs, based on the recognition of the other player. By applying this formulation to the standard prisoner's dilemma game, we numerically and analytically demonstrate that an exploitative relationship can be achieved despite symmetric strategy dynamics and symmetric rules of the game. This exploitative relationship is stabilized by both players: the exploiting player demands the other's unfair cooperation. Even though the exploited player, who receives a lower payoff than the exploiting player, has optimized its own strategy, the player accepts the other's defection to some degree. Whether the final equilibrium state is mutual cooperation, defection, or exploitation crucially depends on the initial conditions. A response that decreases the cooperation probability against a defector leads to oscillations in the probabilities of cooperation between the players and thus to a complicated basin structure for the final equilibrium. In particular, any slight difference between the two players' initial strategies can be amplified and fixed as a large difference in the probabilities of cooperation, leading to fixation of exploitation. In other words, symmetry breaking between the exploiting and exploited players results. Considering the generality of the result, this study provides another perspective on the origin of exploitation in society.
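
The abstract does not give the paper's equations, so the following is only an illustrative sketch and not the authors' formulation. Assuming memory-one probabilistic strategies (each player cooperates with a probability that depends on the previous round's outcome) and the conventional payoff values 3, 0, 5, 1, it computes the long-run average payoffs of two players from the stationary distribution of the resulting Markov chain. The function name `stationary_payoffs`, the payoff values, and the noise parameter `eps` are assumptions introduced here for illustration.

```python
import numpy as np

# Conventional prisoner's dilemma payoffs (assumed values, T > R > P > S).
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def stationary_payoffs(p, q, eps=1e-3):
    """Long-run average payoffs of two memory-one strategies.

    p[k] is player 1's probability of cooperating after outcome k, with
    outcomes ordered (CC, CD, DC, DD) from player 1's own viewpoint.
    q is defined the same way from player 2's viewpoint.
    eps is an assumed action-error rate; a positive value makes the
    stationary distribution unique.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # With probability eps the intended action is flipped.
    p = (1 - eps) * p + eps * (1 - p)
    q = (1 - eps) * q + eps * (1 - q)
    # Player 2 sees the outcomes CD and DC swapped.
    q_sw = q[[0, 2, 1, 3]]

    # Transition matrix over joint outcomes (CC, CD, DC, DD).
    M = np.empty((4, 4))
    for k in range(4):
        a, b = p[k], q_sw[k]
        M[k] = [a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)]

    # Solve v M = v together with sum(v) = 1 in the least-squares sense.
    A = np.vstack([M.T - np.eye(4), np.ones(4)])
    rhs = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v, *_ = np.linalg.lstsq(A, rhs, rcond=None)

    u1 = float(v @ np.array([R, S, T, P]))
    u2 = float(v @ np.array([R, T, S, P]))
    return u1, u2


# Identical strategies give identical payoffs (the symmetric state).
print(stationary_payoffs([0.9, 0.2, 0.8, 0.1], [0.9, 0.2, 0.8, 0.1]))
# An extortionate strategy of the Press-Dyson type against an
# unconditional cooperator gives unequal payoffs (an exploitative state).
print(stationary_payoffs([11 / 13, 1 / 2, 7 / 26, 0], [1, 1, 1, 1], eps=0.0))
```

With `eps=0.0`, the second call returns payoffs of 41/11 (about 3.73) for the first player and 21/11 (about 1.91) for the second, so the payoff gap arises from the strategies alone while the game itself stays symmetric. The sketch evaluates fixed strategies only; it does not model the learning dynamics that the abstract describes.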


Pages: 10
