Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning

Cited by: 1

Authors:

Fan, Litong [1,2]

Yu, Dengxiu [2]

Cheong, Kang Hao [3]

Wang, Zhen [4,5]

Affiliations:

[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China

[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China

[3] Singapore Univ Technol & Design, Sci Math & Technol Cluster, Singapore 487372, Singapore

[4] Northwestern Polytech Univ, Sch Mech Engn iOPEN, Xian 710072, Peoples R China

[5] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Continuous strategy games; evolutionary dynamic; Hamilton-Jacobi-Bellman (HJB); reinforcement learning (RL); strategy updating rules; COOPERATION; DYNAMICS; EMERGENCE;

DOI:

10.1109/TNNLS.2024.3453385

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Evolutionary game theory has usually assumed that agents use the same selection intensity when interacting, ignoring differences in their learning ability and willingness to learn. Individuals are also reluctant to change their strategies too much. Therefore, we design an adaptive strategy updating framework with various selection intensities for continuous strategy games on complex networks based on imitation dynamics, allowing agents to reach the optimal state and a higher cooperation level with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing a performance function that aims to maximize individual payoffs while minimizing strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The RL algorithm employs actor and critic neural networks to approximate the strategy changes and the performance functions, with weights updated by gradient descent. The stability and convergence of the proposed methods are proved using a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods in different games and complex networks.


Pages: 13

