Levy noise promotes cooperation in the prisoner's dilemma game with reinforcement learning

Cited by: 62
Authors
Wang, Lu [1, 2]
Jia, Danyang [1, 2]
Zhang, Long [2, 3]
Zhu, Peican [2, 3]
Perc, Matjaz [4, 5, 6, 7]
Shi, Lei [8]
Wang, Zhen [9, 10]
Affiliations
[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[3] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
[4] Univ Maribor, Fac Nat Sci & Math, Maribor, Slovenia
[5] China Med Univ, China Med Univ Hosp, Dept Med Res, Taichung 404332, Taiwan
[6] Complex Sci Hub Vienna, Vienna, Austria
[7] Alma Mater Europaea, Maribor, Slovenia
[8] Yunnan Univ Finance & Econ, Sch Math & Stat, Kunming 650221, Yunnan, Peoples R China
[9] Northwestern Polytech Univ, Sch Mech Engn, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[10] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Evolutionary dynamics; Prisoner's dilemma; Cooperation; Self-regarding Q-learning; Levy noise; TIT-FOR-TAT; PUNISHMENT;
DOI
10.1007/s11071-022-07289-7
Chinese Library Classification number
TH [Machinery and instrument industry]
Discipline classification code
0802
Abstract
Uncertainties are ubiquitous in everyday life, and it is thus important to explore their effects on the evolution of cooperation. In this paper, the prisoner's dilemma game with reinforcement learning subject to Levy noise is studied. Specifically, diverse fluctuations mimicked by Levy distributed noise are reflected in the payoff matrix of each player. At the same time, the self-regarding Q-learning algorithm is considered as the strategy update rule to learn the behavior that achieves the highest payoff. The results show that not only does Levy noise promote the evolution of cooperation with reinforcement learning, it does so comparatively better than Gaussian noise. We explain this with the iterative updating pattern of the self-regarding Q-learning algorithm, which has an accumulative effect on the noise entering the payoff matrix. It turns out that under Levy noise, the Q-value of cooperative behavior becomes significantly larger than that of defective behavior when the current strategy is defection, which ultimately leads to the prevalence of cooperation, while this is absent with Gaussian noise or without noise. This research thus unveils a particular positive role of Levy noise in the evolutionary dynamics of social dilemmas.
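The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameter values, network structure, or noise parameterization, so everything below is an assumption made for illustration. The sketch assumes two players, a weak prisoner's dilemma payoff (R = 1, P = S = 0, T = b), symmetric alpha-stable noise drawn with the Chambers-Mallows-Stuck method and added to each payoff, an epsilon-greedy policy, and a Q-table whose state is the player's own previous action (one common reading of "self-regarding"). All function names and default values are hypothetical.

```python
import math
import random

C, D = 0, 1  # action indices: cooperate, defect


def symmetric_stable(alpha, scale, rng):
    """Draw one symmetric alpha-stable variate (Chambers-Mallows-Stuck).

    alpha = 2 gives a Gaussian; smaller alpha gives heavier tails.
    """
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = max(rng.expovariate(1.0), 1e-12)  # guard against a zero draw
    if alpha == 1.0:
        return scale * math.tan(v)  # Cauchy special case
    x = (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
         * (math.cos(v - alpha * v) / w) ** ((1 - alpha) / alpha))
    return scale * x


def noisy_payoff(mine, other, b, alpha, scale, rng):
    """Weak prisoner's dilemma payoff plus additive noise (assumed form)."""
    base = [[1.0, 0.0], [b, 0.0]]  # rows: my action, columns: opponent's
    return base[mine][other] + symmetric_stable(alpha, scale, rng)


def q_update(q, state, action, reward, lr, gamma):
    """Standard Q-learning step; the next state is the action just taken."""
    next_state = action
    target = reward + gamma * max(q[next_state])
    q[state][action] = (1 - lr) * q[state][action] + lr * target
    return next_state


def choose(q, state, eps, rng):
    """Epsilon-greedy choice; ties are resolved in favor of defection."""
    if rng.random() < eps:
        return rng.choice((C, D))
    return C if q[state][C] > q[state][D] else D


def simulate(rounds=2000, b=1.2, alpha=1.5, scale=0.1,
             lr=0.8, gamma=0.8, eps=0.02, seed=0):
    """Return the fraction of cooperative moves over all rounds."""
    rng = random.Random(seed)
    q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]  # one table per player
    state = [rng.choice((C, D)) for _ in range(2)]
    coop = 0
    for _ in range(rounds):
        acts = [choose(q[i], state[i], eps, rng) for i in range(2)]
        for i in range(2):
            r = noisy_payoff(acts[i], acts[1 - i], b, alpha, scale, rng)
            state[i] = q_update(q[i], state[i], acts[i], r, lr, gamma)
        coop += acts.count(C)
    return coop / (2 * rounds)
```

Calling `simulate(alpha=1.5)` and `simulate(alpha=2.0)` with the same seed gives a rough way to compare heavy-tailed and Gaussian noise in this toy setting. Whether it reproduces the effect reported in the abstract depends on the assumed parameters and on the population structure, which this sketch omits.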
Pages: 1837-1845
Number of pages: 9
Related papers
54 in total
[1]   Stochastic win-stay-lose-shift strategy with dynamic aspirations in evolutionary social dilemmas [J].
Amaral, Marco A. ;
Wardil, Lucas ;
Perc, Matjaz ;
da Silva, Jafferson K. L. .
PHYSICAL REVIEW E, 2016, 94 (03)
[2]   Enhancement of cooperation in highly clustered scale-free networks [J].
Assenza, Salvatore ;
Gomez-Gardenes, Jesus ;
Latora, Vito .
PHYSICAL REVIEW E, 2008, 78 (01)
[3]   Cooperate or Not Cooperate in Predictable but Periodically Varying Situations? Cooperation in Fast Oscillating Environment [J].
Babajanyan, S. G. ;
Lin, Wayne ;
Cheong, Kang Hao .
ADVANCED SCIENCE, 2020, 7 (21)
[4]   Intelligent tit-for-tat in the iterated prisoner's dilemma game [J].
Baek, Seung Ki ;
Kim, Beom Jun .
PHYSICAL REVIEW E, 2008, 78 (01)
[5]   Reward, Punishment, and Cooperation: A Meta-Analysis [J].
Balliet, Daniel ;
Mulder, Laetitia B. ;
Van Lange, Paul A. M. .
PSYCHOLOGICAL BULLETIN, 2011, 137 (04) :594-615
[6]   Promotion of cooperation induced by appropriate payoff aspirations in a small-world networked game [J].
Chen, Xiaojie ;
Wang, Long .
PHYSICAL REVIEW E, 2008, 77 (01)
[7]   Aspiration-based coevolution of node weights promotes cooperation in the spatial prisoner's dilemma game [J].
Chu, Chen ;
Mu, Chunjiang ;
Liu, Jinzhuo ;
Liu, Chen ;
Boccaletti, Stefano ;
Shi, Lei ;
Wang, Zhen .
NEW JOURNAL OF PHYSICS, 2019, 21 (06)
[8]   Darwin C, 2009, ON THE ORIGIN OF SPECIES, P1, DOI 10.1017/CBO9780511694295.004
[9]   Memory-based stag hunt game on regular lattices [J].
Dong, Yukun ;
Xu, Hedong ;
Fan, Suohai .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2019, 519 :247-255
[10]   Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin [J].
Ezaki, Takahiro ;
Horita, Yutaka ;
Takezawa, Masanori ;
Masuda, Naoki .
PLOS COMPUTATIONAL BIOLOGY, 2016, 12 (07)