Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning

Cited by: 1
Authors
Fan, Litong [1, 2]
Yu, Dengxiu [2]
Cheong, Kang Hao [3]
Wang, Zhen [4, 5]
Affiliations
[1] Northwestern Polytech Univ, Sch Mech Engn, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[3] Singapore Univ Technol & Design, Sci Math & Technol Cluster, Singapore 487372, Singapore
[4] Northwestern Polytech Univ, Sch Mech Engn iOPEN, Xian 710072, Peoples R China
[5] Northwestern Polytech Univ, Sch Cybersecur, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Continuous strategy games; evolutionary dynamic; Hamilton-Jacobi-Bellman (HJB); reinforcement learning (RL); strategy updating rules; COOPERATION; DYNAMICS; EMERGENCE;
DOI
10.1109/TNNLS.2024.3453385
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Evolutionary game theory has usually assumed that agents use the same selection intensity when interacting, ignoring differences in their learning ability and willingness to learn. Moreover, individuals are reluctant to change their strategies too much. We therefore design an adaptive strategy updating framework with varying selection intensities for continuous strategy games on complex networks, based on imitation dynamics, that allows agents to reach the optimal state and a higher cooperation level with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing a performance function that rewards higher individual payoffs and penalizes strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The RL algorithm employs actor and critic neural networks, with weights updated by gradient descent, to approximate the strategy changes and the performance function, respectively. The stability and convergence of the proposed methods are proved using a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods across different games and complex networks.
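As a rough illustration of the imitation dynamics with agent-specific selection intensities mentioned in the abstract, the following is a minimal sketch and not the article's method. The continuous donation-game payoff, the ring network, the parameters `b`, `c`, `dt`, and the function names `payoffs`, `imitation_step`, and `simulate` are all assumptions introduced here. The sketch does not implement the HJB equation or the actor-critic RL algorithm.

```python
# Illustrative sketch only: the payoff model, ring network, and update rule
# below are assumptions, not the formulation used in the article.

def payoffs(x, neighbors, b=1.0, c=0.5):
    """Assumed continuous donation game: agent i pays c*x[i] per neighbor
    and receives b*x[j] from each neighbor j."""
    return [sum(b * x[j] - c * x[i] for j in neighbors[i]) for i in range(len(x))]

def imitation_step(x, neighbors, intensity, dt=0.05):
    """One Euler step of imitation dynamics with a per-agent selection intensity."""
    p = payoffs(x, neighbors)
    new_x = []
    for i, xi in enumerate(x):
        # Move toward neighbors that earn more, weighted by the payoff gap.
        pull = sum(max(p[j] - p[i], 0.0) * (x[j] - xi) for j in neighbors[i])
        # Clip so that strategies stay in the assumed range [0, 1].
        new_x.append(min(1.0, max(0.0, xi + dt * intensity[i] * pull)))
    return new_x

def simulate(x0, neighbors, intensity, steps=200):
    """Repeat the imitation step and return the final strategy profile."""
    x = list(x0)
    for _ in range(steps):
        x = imitation_step(x, neighbors, intensity)
    return x

n = 6
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # hypothetical ring network
x0 = [0.1, 0.9, 0.4, 0.7, 0.2, 0.6]                       # initial strategies
intensity = [0.5, 1.0, 1.5, 2.0, 1.0, 0.5]                # heterogeneous selection intensities
final = simulate(x0, ring, intensity)
print([round(v, 3) for v in final])
```

With these step sizes each update is a convex combination of an agent's own strategy and its neighbors' strategies, so the profile never leaves the range spanned by the initial strategies. Under this assumed payoff, plain imitation tends to favor lower contributions, which is the kind of outcome an optimized updating rule would aim to improve on.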
Pages: 13
Related papers
50 in total
  • [1] Improved Reinforcement Learning in Asymmetric Real-time Strategy Games via Strategy Diversity
    Dasgupta, Prithviraj
    Kliem, John
    INTERNATIONAL JOURNAL OF SERIOUS GAMES, 2023, 10 (01): : 19 - 38
  • [2] Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games
    Fan, Litong
    Yu, Dengxiu
    Wang, Zhen
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (05): : 6807 - 6818
  • [3] Memetic Evolution Strategy for Reinforcement Learning
    Qu, Xinghua
    Ong, Yew-Soon
    Hou, Yaqing
    Shen, Xiaobo
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 1922 - 1928
  • [4] Tabular Reinforcement Learning in Real-Time Strategy Games via Options
    Tavares, Anderson R.
    Chaimowicz, Luiz
    PROCEEDINGS OF THE 2018 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND GAMES (CIG'18), 2018, : 229 - 236
  • [5] Learning Adaptive Graph Protection Strategy on Dynamic Networks via Reinforcement Learning
    Wijayanto, Arie Wahyu
    Murata, Tsuyoshi
    2018 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2018), 2018, : 534 - 539
  • [6] Strategy Acquisition for Games Based on Simplified Reinforcement Learning Using a Strategy Network
    Kanakubo, Masaaki
    Hagiwara, Masafumi
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2005, 9 (02) : 203 - 210
  • [7] Real Time Strategy Games: A Reinforcement Learning Approach
    Sethy, Harshit
    Patel, Amit
    Padmanabhan, Vineet
    ELEVENTH INTERNATIONAL CONFERENCE ON COMMUNICATION NETWORKS, ICCN 2015/INDIA ELEVENTH INTERNATIONAL CONFERENCE ON DATA MINING AND WAREHOUSING, ICDMW 2015/INDIA ELEVENTH INTERNATIONAL CONFERENCE ON IMAGE AND SIGNAL PROCESSING, ICISP 2015, 2015, 54 : 257 - 264
  • [8] OPTIMAL STRATEGY SETS FOR CONTINUOUS 2 PERSON GAMES
    CHIN, H
    PARTHASARATHY, T
    RAGHAVAN, TES
    SANKHYA-THE INDIAN JOURNAL OF STATISTICS SERIES A, 1976, 38 (JAN): : 92 - 98
  • [9] Deriving the Optimal Strategy for the Two Dice Pig Game via Reinforcement Learning
    Zhu, Tian
    Ma, Merry H.
    STATS, 2022, 5 (03): : 805 - 818
  • [10] Complex equipment troubleshooting strategy generation based on Bayesian networks and reinforcement learning
    Liu B.
    Yu J.
    Han D.
    Tang D.
    Li X.
    Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2024, 50 (04): : 1354 - 1364