Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game

Cited by: 26
Authors
Yang, Zhengzhi [1]
Zheng, Lei [1]
Perc, Matjaz [2, 3, 4, 5, 6]
Li, Yumeng [1]
Affiliations
[1] Beihang Univ, Beijing 100191, Peoples R China
[2] Univ Maribor, Fac Nat Sci & Math, Koroska Cesta 160, Maribor 2000, Slovenia
[3] China Med Univ, China Med Univ Hosp, Dept Med Res, Taichung 404332, Taiwan
[4] Alma Mater Europaea, Slovenska Ul 17, Maribor 2000, Slovenia
[5] Complex Sci Hub Vienna, Josefstadterstr 39, A-1080 Vienna, Austria
[6] Kyung Hee Univ, Dept Phys, 26 Kyungheedae Ro, Seoul, South Korea
Funding
National Natural Science Foundation of China; National Key Research and Development Program
Keywords
Evolutionary games; Cooperation; Prisoner's dilemma game; Reinforcement learning; Networks; Fitness; Go
DOI
10.1016/j.amc.2023.128364
Chinese Library Classification (CLC) number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
Many recent studies have used reinforcement learning methods to investigate the behavior of agents in evolutionary games. Q-learning, in particular, has become a mainstream method during this development. Here we introduce Q-learning agents into the evolutionary prisoner's dilemma game on a square lattice. Specifically, we associate the state space of Q-learning agents with the strategies of their neighbors, and we introduce a neighboring reward information sharing mechanism. We thus provide Q-learning agents with the payoff information of their neighbors, in addition to their strategies, which has not been done in previous studies. Through simulations, we show that considering neighborhood payoff information can significantly promote cooperation in the population. Moreover, we show that for an appropriate strength of neighborhood payoff information sharing, a chessboard pattern emerges on the lattice. We analyze in detail the reasons for the emergence of the chessboard pattern and the increase in cooperation frequency, and we also provide a theoretical analysis based on the pair approximation method. We hope that our research will inspire effective approaches for resolving social dilemmas by means of sharing more information among reinforcement learning agents during evolutionary games.
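The abstract does not give the update rule, the payoff matrix, or the exact form of the sharing mechanism, so the following is only a minimal sketch of the kind of setup it describes, not the authors' implementation. It assumes a weak prisoner's dilemma with temptation `b` (R=1, T=b, S=P=0), a state defined as the number of cooperating neighbors, and a reward that blends an agent's own payoff with the mean payoff of its four neighbors through a hypothetical sharing weight `w`. The names `simulate`, `payoff`, `state_of`, and all parameter values are illustrative assumptions.

```python
import random

def neighbors(i, j, n):
    """Four von Neumann neighbors on an n x n lattice with periodic boundaries."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]

def payoff(grid, i, j, b):
    """Assumed weak prisoner's dilemma (R=1, T=b, S=P=0); 1 = cooperate, 0 = defect."""
    coop = sum(grid[x][y] for x, y in neighbors(i, j, len(grid)))
    return (1.0 if grid[i][j] == 1 else b) * coop

def state_of(grid, i, j):
    """Assumed state encoding: number of cooperating neighbors (0..4)."""
    return sum(grid[x][y] for x, y in neighbors(i, j, len(grid)))

def simulate(n=10, steps=100, b=1.2, w=0.5, alpha=0.1, gamma=0.9, eps=0.02, seed=0):
    """Run synchronous Q-learning on the lattice; return the final cooperator fraction."""
    rng = random.Random(seed)
    grid = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
    # One Q-table per agent, indexed as q[i][j][state][action].
    q = [[[[0.0, 0.0] for _ in range(5)] for _ in range(n)] for _ in range(n)]
    cells = [(i, j) for i in range(n) for j in range(n)]
    for _ in range(steps):
        states = {c: state_of(grid, *c) for c in cells}
        # Epsilon-greedy action selection; ties are broken in favor of defection.
        new_grid = [row[:] for row in grid]
        for i, j in cells:
            qs = q[i][j][states[(i, j)]]
            if rng.random() < eps:
                new_grid[i][j] = rng.randint(0, 1)
            else:
                new_grid[i][j] = 1 if qs[1] > qs[0] else 0
        grid = new_grid
        own = {c: payoff(grid, *c, b) for c in cells}
        for i, j in cells:
            # Hypothetical sharing rule: blend own payoff with the neighbors' mean payoff.
            shared = sum(own[c] for c in neighbors(i, j, n)) / 4.0
            reward = (1 - w) * own[(i, j)] + w * shared
            s, a = states[(i, j)], grid[i][j]
            s_next = state_of(grid, i, j)
            # Standard tabular Q-learning update.
            target = reward + gamma * max(q[i][j][s_next])
            q[i][j][s][a] += alpha * (target - q[i][j][s][a])
    return sum(map(sum, grid)) / (n * n)
```

Calling `simulate(w=0.0)` and `simulate(w=0.5)` gives a rough way to compare runs with and without payoff sharing under these assumptions; the sketch makes no claim to reproduce the reported cooperation levels or the chessboard pattern.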
Pages: 12