PPO-ACT: Proximal policy optimization with adversarial curriculum transfer for spatial public goods games

Cited by: 0
Authors
Yang, Zhaoqilin [1, 2]
Li, Chanchan [3]
Wang, Xin [4]
Tian, Youliang [2, 5]
Affiliations
[1] Guizhou Univ, Coll Comp Sci & Technol, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China
[2] Guizhou Univ, Inst Cryptog & Data Secur, Guiyang 550025, Guizhou, Peoples R China
[3] Guizhou Univ, Coll Math & Stat, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China
[4] Beijing Jiaotong Univ, Sch Math & Stat, Beijing 100044, Peoples R China
[5] Guizhou Univ, Coll Big Data & Informat Engn, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Public goods game; Deep reinforcement learning; Proximal policy optimization; Adversarial curriculum transfer; Evolutionary games; Dynamics
DOI
10.1016/j.chaos.2025.116762
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
This study investigates the mechanisms by which cooperation evolves in the spatial public goods game. A novel deep reinforcement learning framework, Proximal Policy Optimization with Adversarial Curriculum Transfer (PPO-ACT), is proposed to model agent strategy optimization in dynamic environments. Traditional evolutionary game models are often limited in modeling long-term decision-making processes: imitation-based rules (e.g., Fermi) lack strategic foresight, while tabular methods (e.g., Q-learning) fail to capture spatial-temporal correlations. Deep reinforcement learning addresses these limitations by bridging policy gradient methods with evolutionary game theory. Our study pioneers the application of proximal policy optimization's continuous strategy optimization capability to public goods games through a two-stage adversarial curriculum transfer training paradigm. The experimental results show that PPO-ACT performs better in critical enhancement factor regimes. Compared with standard proximal policy optimization, Q-learning, and Fermi update rules, it achieves earlier cooperation phase transitions and maintains stable cooperative equilibria. The framework is also more robust in challenging scenarios such as all-defector initial conditions. Systematic comparisons reveal the unique advantage of policy gradient methods in population-scale cooperation, i.e., achieving spatiotemporal payoff coordination through value function propagation. Our work provides a new computational framework for studying the emergence of cooperation in complex systems, algorithmically validating the "punishment promotes cooperation" hypothesis while offering methodological insights for multi-agent system strategy design.
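For readers unfamiliar with the setting named in the abstract, the following is a minimal sketch, not the authors' implementation, of a single public goods group payoff and the Fermi imitation probability used as a baseline. The function names, the unit contribution cost, and the noise value 0.5 are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def group_payoffs(strategies, r, cost=1.0):
    """Payoffs for one public goods group (illustrative assumption).

    strategies: list of 1 (cooperate) or 0 (defect).
    r: enhancement factor applied to the pooled contributions.
    cost: contribution paid by each cooperator (assumed to be 1.0).
    """
    n_coop = sum(strategies)
    # The enhanced pool is split equally among all group members.
    share = r * cost * n_coop / len(strategies)
    # Cooperators pay the contribution cost; defectors pay nothing.
    return [share - cost * s for s in strategies]


def fermi_probability(payoff_self, payoff_neighbor, noise=0.5):
    """Probability that a player imitates a neighbor (Fermi rule).

    noise: selection noise, often written K (0.5 is an assumed value).
    """
    return 1.0 / (1.0 + math.exp((payoff_self - payoff_neighbor) / noise))


if __name__ == "__main__":
    # Two cooperators and three defectors in a group of five, r = 5.
    payoffs = group_payoffs([1, 1, 0, 0, 0], r=5.0)
    print(payoffs)
    # A cooperator compares its payoff with a defecting neighbor's payoff.
    print(fermi_probability(payoffs[0], payoffs[2]))
```

In this sketch a defector out-earns a cooperator in the same group, so the cooperator is more likely than not to imitate the defector. This shows the myopic, payoff-comparison character of imitation rules that the abstract contrasts with policy gradient methods.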
Pages: 12