Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning

Cited by: 2
Authors
Fan, Litong [1 ,2 ]
Yu, Dengxiu [2 ]
Cheong, Kang Hao [3 ]
Wang, Zhen [4 ,5 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[3] Singapore Univ Technol & Design, Sci Math & Technol Cluster, Singapore 487372, Singapore
[4] Northwestern Polytech Univ, Sch Mech Engn iOPEN, Xian 710072, Peoples R China
[5] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Continuous strategy games; evolutionary dynamic; Hamilton-Jacobi-Bellman (HJB); reinforcement learning (RL); strategy updating rules; COOPERATION; DYNAMICS; EMERGENCE;
DOI
10.1109/TNNLS.2024.3453385
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Evolutionary game theory has typically assumed that agents interact with the same selection intensity, ignoring differences in their learning ability and willingness to learn; moreover, individuals are reluctant to change their strategies drastically. We therefore design an adaptive strategy updating framework with heterogeneous selection intensities for continuous strategy games on complex networks, based on imitation dynamics, which allows agents to reach the optimal state and a higher cooperation level with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing a performance function that trades off maximizing individual payoffs against minimizing strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The algorithm employs actor and critic neural networks to approximate the strategy changes and the performance functions, respectively, with the weights updated by gradient descent. The stability and convergence of the proposed methods are proved with a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods in different games and on various complex networks.
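The abstract describes an actor-critic, value-iteration scheme in which a critic approximates each agent's performance function (cost-to-go) and an actor approximates the strategy change that minimizes it, with both sets of weights trained by gradient descent. The sketch below is a minimal illustration of that idea only, not the authors' algorithm: the ring network, the quadratic stage cost in stage_cost, the feature maps phi_c and phi_a, and all learning rates and hyperparameters are assumptions introduced for illustration.

```python
# Minimal, self-contained sketch (not the authors' code) of an actor-critic
# value-iteration loop: a per-agent critic approximates a cost-to-go and an
# actor approximates the strategy change that minimizes it, with
# gradient-descent weight updates. Network, stage cost, features, and
# hyperparameters below are all assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

N = 10                                              # number of agents (assumed)
A = np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)  # ring adjacency
x = rng.uniform(0.0, 1.0, N)                        # continuous strategies in [0, 1]

lam = 0.1                                           # penalty on strategy-change effort (assumed)
gamma = 0.95                                        # discount factor (assumed)


def stage_cost(x, u, i):
    """Hypothetical stage cost: gap to neighbours' mean strategy plus effort penalty."""
    nbr_mean = A[i] @ x / A[i].sum()
    return (x[i] - nbr_mean) ** 2 + lam * u ** 2


def phi_c(s):                                       # critic features: V_i(s) ~ w1*s + w2*s^2
    return np.array([s, s ** 2])


def phi_a(s):                                       # actor features: u_i(s) ~ a1*s + a2
    return np.array([s, 1.0])


Wc = np.zeros((N, 2))                               # critic weights, one row per agent
Wa = np.zeros((N, 2))                               # actor weights, one row per agent
alpha_c, alpha_a = 0.05, 0.01                       # learning rates (assumed)

for _ in range(2000):
    for i in range(N):
        u = Wa[i] @ phi_a(x[i])                     # proposed strategy change for agent i
        c = stage_cost(x, u, i)
        x_next = np.clip(x[i] + u, 0.0, 1.0)        # apply the change, keep strategy in [0, 1]

        # Critic: semi-gradient temporal-difference step toward the Bellman target.
        td = c + gamma * Wc[i] @ phi_c(x_next) - Wc[i] @ phi_c(x[i])
        Wc[i] += alpha_c * td * phi_c(x[i])

        # Actor: descend the approximate cost d/du [c(x, u) + gamma * V_i(x + u)].
        dV_dx = Wc[i] @ np.array([1.0, 2.0 * x_next])
        Wa[i] -= alpha_a * (2.0 * lam * u + gamma * dV_dx) * phi_a(x[i])

        x[i] = x_next

print("final strategies:", np.round(x, 3))
```

In this sketch the critic is a simple quadratic in the agent's own strategy and the actor is linear; the paper's coupled multi-agent HJB formulation and neural-network approximators are substantially richer, so the code should be read only as a schematic of the update structure.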
Pages: 12827-12839
Page count: 13