Reputation-Based Interaction Promotes Cooperation With Reinforcement Learning

Cited by: 6
Authors
Ren, Tianyu [1]
Zeng, Xiao-Jun [1]
Affiliations
[1] Univ Manchester, Dept Comp Sci, Manchester M13 9PL, England
Keywords
Games; Statistics; Sociology; Heuristic algorithms; Simulation; Task analysis; Intelligent agents; Computer simulation; evolutionary computation; intelligent agents; multiagent systems; nonlinear network analysis; EVOLUTION; GAMES; NORM
DOI
10.1109/TEVC.2023.3304911
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dynamical interaction represents a fundamental coevolutionary rule that addresses the intricacies of cooperation in social dilemmas. It provides a normative account for the changes in ties within interaction networks in response to the behavior of social partners. While considerable efforts have explored the role of partner selection in fostering cooperation, there remains a limited understanding of how agents learn to establish effective interaction patterns and adapt their connections accordingly. To bridge this knowledge gap, we leverage recent advancements in reinforcement learning (RL) and propose an adaptive interaction mechanism to investigate self-organization behavior in the iterated prisoner's dilemma game. Within this framework, artificial agents are trained using a self-regarding Roth-Erev algorithm, utilizing reputation as a dynamic signal to update their willingness to engage with neighbors. Additionally, these agents are endowed with the capability to sever inactive connections. Simulation results demonstrate the effectiveness of utilizing RL and local information from reputation to capture the dynamics of interactions. Notably, we discover that the entangled coevolution of strategy and interaction network can facilitate the emergence and maintenance of cooperation, despite the optimal tolerance threshold for ineffective neighbors varying depending on the strength of the social dilemma. Furthermore, the emerging network topology presented in this work accurately captures the assortative mixing pattern observed in previous experiments and realistic evidence. Finally, we validate the simulation results through theoretical analysis and confirm the robustness of the proposed mechanism across populations of varying sizes and initial structures.
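The abstract names a Roth-Erev learner but does not give its update rule. As a rough illustration only, the sketch below implements the textbook Roth-Erev propensity update with forgetting and experimentation parameters, applied to a hypothetical two-action choice (interact or do not interact with one neighbor). The function names, the parameter defaults, and the use of a non-negative reward as a stand-in for a reputation signal are assumptions, not the authors' formulation.

```python
import random


def roth_erev_update(propensities, chosen, reward,
                     forgetting=0.1, experimentation=0.2):
    """Return updated propensities after one round (textbook Roth-Erev).

    Assumption: `reward` is non-negative so propensities stay positive.
    The chosen action receives reward * (1 - experimentation); the
    remainder is spread evenly over the other actions.
    """
    n = len(propensities)
    if n < 2:
        raise ValueError("need at least two actions")
    updated = []
    for k, q in enumerate(propensities):
        if k == chosen:
            gain = reward * (1.0 - experimentation)
        else:
            gain = reward * experimentation / (n - 1)
        # Old propensity decays by the forgetting rate before the gain is added.
        updated.append((1.0 - forgetting) * q + gain)
    return updated


def choice_probabilities(propensities):
    """Probability of each action is its share of the total propensity."""
    total = sum(propensities)
    return [q / total for q in propensities]


def choose(propensities, rng):
    """Sample an action index in proportion to its propensity."""
    return rng.choices(range(len(propensities)), weights=propensities, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical actions: index 0 = interact, index 1 = do not interact.
    q = [1.0, 1.0]
    for _ in range(20):
        action = choose(q, rng)
        # Placeholder reward: interacting pays 1.0, abstaining pays 0.0.
        reward = 1.0 if action == 0 else 0.0
        q = roth_erev_update(q, action, reward)
    print(choice_probabilities(q))
```

In a fuller model, the placeholder reward would be replaced by whatever signal the agent learns from, and a link whose interaction probability stays low for long enough could be removed; how the paper defines both is not stated in the abstract.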
Pages: 1177-1188
Number of pages: 12