Finite-time safe reinforcement learning control of multi-player nonzero-sum game for quadcopter systems

Cited by: 1
Authors
Tan, Junkai [1 ,2 ]
Xue, Shuangsi [1 ,2 ]
Guan, Qingshu [1 ,2 ]
Qu, Kai [1 ,2 ]
Cao, Hui [1 ,2 ]
Affiliations
[1] Xi'an Jiaotong University, School of Electrical Engineering, Xi'an 710049, People's Republic of China
[2] Xi'an Jiaotong University, State Key Lab, Xi'an 710049, People's Republic of China
Funding
China Postdoctoral Science Foundation
Keywords
Finite-time optimal control; Nonzero-sum game; Reinforcement learning; Neural network; Dynamic event-trigger; Adaptive dynamic programming; Synchronization
DOI
10.1016/j.ins.2025.122117
CLC number
TP [automation and computer technology]
Subject classification code
0812
Abstract
This paper investigates a finite-time safe reinforcement learning control algorithm for multi-player nonzero-sum games (FT-SRL-NZS). To address the finite-time safe optimal control problem, value functions incorporating designated barrier functions for the players are established in a transformed finite-time stable space. The finite-time safe optimal controller is derived from the solution of the transformed Nash equilibrium condition. An actor-critic structure is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation in the finite-time stable space, approximating the finite-time optimal value function and its corresponding controller with a novel finite-time concurrent learning update law. A dynamic event-trigger rule adjusts the trigger condition in real time, reducing the computation and communication required to compute the Nash equilibrium. Lyapunov stability analysis establishes finite-time stability of the closed-loop equilibrium. Numerical simulations and unmanned aerial vehicle (UAV) hardware tests illustrate the efficacy of the proposed finite-time safe control algorithm.
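For orientation, the formulation below is a standard N-player nonzero-sum game setup consistent with the abstract's description. The paper's specific finite-time space transformation is not reproduced here, and the barrier choice shown is only one common option, not necessarily the authors' design.

```latex
% Dynamics with one input channel per player, and player i's barrier-augmented value:
\dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\, u_j, \qquad
V_i(x(t)) = \int_{t}^{\infty} \Big( x^{\top} Q_i\, x
          + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j + \mathcal{B}_i(x) \Big)\, \mathrm{d}\tau .

% Coupled HJB / Nash equilibrium condition for player i, with value gradient \nabla V_i:
0 = \nabla V_i^{\top} \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*} \Big)
  + x^{\top} Q_i\, x + \sum_{j=1}^{N} (u_j^{*})^{\top} R_{ij}\, u_j^{*} + \mathcal{B}_i(x),
\qquad
u_i^{*} = -\tfrac{1}{2}\, R_{ii}^{-1} g_i^{\top}(x)\, \nabla V_i .

% One common log-barrier keeping a safety output h_k(x) inside |h_k(x)| < b_k:
\mathcal{B}_i(x) = \sum_{k} \log \frac{b_k^{2}}{\, b_k^{2} - h_k^{2}(x) \,} .
```

As a rough, self-contained illustration of how the abstract's ingredients (a critic per player, a concurrent-learning replay buffer, and a dynamic event trigger) can fit together, the following Python sketch uses a toy two-player linear system. Every model, feature, and gain choice is an assumption for illustration, not the paper's quadcopter implementation or its finite-time update law.

```python
import numpy as np

# Toy two-player linear game: xdot = A x + B1 u1 + B2 u2, quadratic costs.
# All matrices and gains below are illustrative assumptions.
A = np.array([[0., 1.], [-1., -1.]])
B = [np.array([[0.], [1.]]), np.array([[0.], [0.5]])]   # one input channel per player
Q = [np.eye(2), 2 * np.eye(2)]
R = [np.array([[1.]]), np.array([[2.]])]

phi = lambda x: np.array([x[0]**2, x[0]*x[1], x[1]**2])                 # critic features
dphi = lambda x: np.array([[2*x[0], 0.], [x[1], x[0]], [0., 2*x[1]]])   # Jacobian of phi

W = [np.ones(3), np.ones(3)]       # critic weights, one per player
memory = []                        # concurrent-learning buffer of (x, u, xdot)
alpha, lam, beta = 2.0, 1.0, 0.05  # learning rate, trigger decay, trigger gain
eta = 1.0                          # dynamic event-trigger internal variable
x = np.array([1.0, -0.5])
x_hat, dt = x.copy(), 0.01         # last-broadcast state, integration step

def control(i, xs):
    # u_i = -1/2 R_i^{-1} B_i^T (dphi/dx)^T W_i, i.e. gradient of the critic's value
    return -0.5 * np.linalg.solve(R[i], B[i].T @ dphi(xs).T @ W[i])

for k in range(5000):
    u = [control(i, x_hat) for i in range(2)]             # controls held between events
    xdot = A @ x + sum(B[i] @ u[i] for i in range(2))

    # Dynamic event trigger: broadcast the state only when the measurement gap
    # outgrows the adaptive threshold beta*||x||^2 + eta.
    e = x - x_hat
    eta += dt * (-lam * eta + beta * (x @ x) - e @ e)
    if e @ e >= beta * (x @ x) + max(eta, 0.0):
        x_hat = x.copy()

    if k % 50 == 0:                                       # store samples for replay
        memory.append((x.copy(), [ui.copy() for ui in u], xdot.copy()))

    for i in range(2):
        def residual(xs, us, xds):
            r = xs @ Q[i] @ xs + us[i] @ R[i] @ us[i]     # instantaneous cost
            sigma = dphi(xs) @ xds                        # regressor: dphi/dx * xdot
            return sigma, W[i] @ sigma + r                # (regressor, Bellman residual)

        sigma, delta = residual(x, u, xdot)               # current-sample gradient step
        dW = -alpha * sigma * delta / (1.0 + sigma @ sigma) ** 2
        for xs, us, xds in memory[-20:]:                  # concurrent-learning replay
            s, d = residual(xs, us, xds)
            dW += -alpha * s * d / (len(memory[-20:]) * (1.0 + s @ s) ** 2)
        W[i] = W[i] + dt * dW

    x = x + dt * xdot                                     # Euler step of the plant

print("learned critic weights:", [w.round(3) for w in W])
```

The internal variable eta relaxes the static threshold beta*||x||^2 while the broadcast error stays small, which matches the abstract's point that adapting the trigger condition online reduces the computation and communication spent on recomputing the Nash equilibrium.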
Pages: 21