Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning

Cited by: 2
Authors
Fan, Litong [1,2]
Yu, Dengxiu [2 ]
Cheong, Kang Hao [3 ]
Wang, Zhen [4,5]
Affiliations
[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[3] Singapore Univ Technol & Design, Sci Math & Technol Cluster, Singapore 487372, Singapore
[4] Northwestern Polytech Univ, Sch Mech Engn iOPEN, Xian 710072, Peoples R China
[5] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Continuous strategy games; evolutionary dynamic; Hamilton-Jacobi-Bellman (HJB); reinforcement learning (RL); strategy updating rules; COOPERATION; DYNAMICS; EMERGENCE;
DOI
10.1109/TNNLS.2024.3453385
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Earlier work in evolutionary game theory usually assumed that agents use the same selection intensity when interacting, ignoring differences in their learning ability and willingness to learn. Individuals are also reluctant to change their strategies too much. Therefore, we design an adaptive strategy updating framework with various selection intensities for continuous strategy games on complex networks based on imitation dynamics, allowing agents to achieve the optimal state and a higher cooperation level with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing the performance function. This function aims to maximize individual payoffs while minimizing strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The RL algorithm employs actor and critic neural networks to approximate the strategy changes and the performance functions, respectively, with weights updated by gradient descent. The stability and convergence of the proposed methods are proved using a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods in different games and complex networks.
Pages: 12827-12839
Number of pages: 13