Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning

Cited by: 2

Authors:

Fan, Litong [1,2]

Yu, Dengxiu [2]

Cheong, Kang Hao [3]

Wang, Zhen [4,5]

Affiliations:

[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China

[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China

[3] Singapore Univ Technol & Design, Sci Math & Technol Cluster, Singapore 487372, Singapore

[4] Northwestern Polytech Univ, Sch Mech Engn iOPEN, Xian 710072, Peoples R China

[5] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Continuous strategy games; evolutionary dynamic; Hamilton-Jacobi-Bellman (HJB); reinforcement learning (RL); strategy updating rules; COOPERATION; DYNAMICS; EMERGENCE;

DOI:

10.1109/TNNLS.2024.3453385

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Evolutionary game theory has usually assumed that agents use the same selection intensity when interacting, ignoring differences in their learning ability and willingness to learn. Individuals are also reluctant to change their strategies too much. Therefore, we design an adaptive strategy updating framework with various selection intensities for continuous strategy games on complex networks based on imitation dynamics, allowing agents to achieve the optimal state and a higher cooperation level with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing a performance function that aims to maximize individual payoffs while minimizing strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The RL algorithm employs actor and critic neural networks to approximate the strategy changes and the performance functions, with weights updated by gradient descent. The stability and convergence of the proposed methods are proved using a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods in different games and complex networks.
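
As a rough illustration of the setting the abstract describes (continuous strategies, imitation dynamics, and agent-specific selection intensities on a network), the following is a minimal sketch. It is not the article's HJB or actor-critic method. The ring network, the donation-game payoff with benefit `b` and cost `c`, the logistic imitation weight, and all function names are assumptions made only for this example.

```python
import numpy as np

def ring_adjacency(n):
    """Adjacency matrix of a ring where each agent has two neighbors (assumed topology)."""
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i - 1) % n] = 1.0
        adj[i, (i + 1) % n] = 1.0
    return adj

def payoffs(x, adj, b=3.0, c=1.0):
    """Assumed continuous donation game: agent i receives b * x_j from each
    neighbor j and pays c * x_i for each neighbor it interacts with."""
    degree = adj.sum(axis=1)
    return b * (adj @ x) - c * degree * x

def imitation_step(x, adj, intensity, dt=0.1):
    """One imitation update with a separate selection intensity per agent."""
    p = payoffs(x, adj)
    gain = p[None, :] - p[:, None]          # gain[i, j] = payoff_j - payoff_i
    # Agent i weights neighbor j by a logistic function of intensity_i * gain.
    weight = adj / (1.0 + np.exp(-intensity[:, None] * gain))
    drift = (weight * (x[None, :] - x[:, None])).sum(axis=1)
    return np.clip(x + dt * drift, 0.0, 1.0)

def simulate(n=20, steps=200, seed=0):
    """Run the dynamics and return the strategy history, shape (steps + 1, n)."""
    rng = np.random.default_rng(seed)
    adj = ring_adjacency(n)
    x = rng.uniform(0.0, 1.0, n)            # initial cooperation levels in [0, 1]
    intensity = rng.uniform(0.1, 2.0, n)    # heterogeneous selection intensities
    history = [x.copy()]
    for _ in range(steps):
        x = imitation_step(x, adj, intensity)
        history.append(x.copy())
    return np.array(history)

if __name__ == "__main__":
    hist = simulate()
    print("initial mean strategy:", hist[0].mean())
    print("final mean strategy:", hist[-1].mean())
```

In this sketch the selection intensities are fixed random values. In the framework the abstract describes, the strategy updating rule is instead learned by trading payoff against the size of strategy changes, which this example does not attempt.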


Pages: 13

50 items in total

[1] Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games [J].

Fan, Litong; Yu, Dengxiu; Wang, Zhen.

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (05) :6807-6818

[2] A reinforcement learning-based strategy updating model for the cooperative evolution [J].

Wang, Xianjia; Yang, Zhipeng; Liu, Yanli; Chen, Guici.

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2023, 618

[3] Evolutionary dynamics of continuous strategy games on graphs and social networks under weak selection [J].

Zhong, Weicai; Liu, Jing; Zhang, Li.

BIOSYSTEMS, 2013, 111 (02) :102-110

[4] Strategy evolution on higher-order networks [J].

Sheng, Anzhi; Su, Qi; Wang, Long; Plotkin, Joshua B.

NATURE COMPUTATIONAL SCIENCE, 2024, 4 (04) :274-284

[5] Strategy evolution on dynamic networks [J].

Su, Qi; McAvoy, Alex; Plotkin, Joshua B.

NATURE COMPUTATIONAL SCIENCE, 2023, 3 (09) :763-+

[6] Nash-Minmax Strategy for Multiplayer Multiagent Graphical Games With Reinforcement Learning [J].

Lian, Bosen; Xue, Wenqian; Lewis, Frank L.; Davoudi, Ali.

IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2025, 12 (01) :763-775

[7] Evolutionary Dynamics of Continuous Strategy Games on Social Networks under Weak Selection: A Preliminary Study [J].

Zhong, Weicai; Zhang, Yang; Liu, Jing.

2011 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2011, :2514-2518

[8] An Adaptive Strategy via Reinforcement Learning for the Prisoner's Dilemma Game [J].

Xue, Lei; Sun, Changyin; Wunsch, Donald; Zhou, Yingjiang; Yu, Fang.

IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2018, 5 (01) :301-310

[9] Nonatomic potential games: the continuous strategy case [J].

Cheung, Man-Wah; Lahkar, Ratul.

GAMES AND ECONOMIC BEHAVIOR, 2018, 108 :341-362

[10] A model for the evolution of reinforcement learning in fluctuating games [J].

Dridi, Slimane; Lehmann, Laurent.

ANIMAL BEHAVIOUR, 2015, 104 :87-114
