Finite-time safe reinforcement learning control of multi-player nonzero-sum game for quadcopter systems

Cited by: 1
Authors
Tan, Junkai [1 ,2 ]
Xue, Shuangsi [1 ,2 ]
Guan, Qingshu [1 ,2 ]
Qu, Kai [1 ,2 ]
Cao, Hui [1 ,2 ]
Affiliations
[1] Xi'an Jiaotong University, School of Electrical Engineering, Xi'an 710049, People's Republic of China
[2] Xi'an Jiaotong University, State Key Lab, Xi'an 710049, People's Republic of China
Funding
China Postdoctoral Science Foundation
Keywords
Finite-time optimal control; Nonzero-sum game; Reinforcement learning; Neural network; Dynamic event-trigger; Adaptive dynamic programming; Synchronization
DOI
10.1016/j.ins.2025.122117
CLC number
TP [automation and computer technology]
Subject classification code
0812
Abstract
This paper investigates a finite-time safe reinforcement learning control algorithm for multi-player nonzero-sum games (FT-SRL-NZS). To address the finite-time safe optimal control problem, value functions incorporating designated barrier functions for the players are established in a transformed finite-time stable space. The finite-time safe optimal controller is derived from the solution of the transformed Nash equilibrium condition. An actor-critic structure is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation in the finite-time stable space, approximating the finite-time optimal value function and its corresponding controller with a novel finite-time concurrent learning update law. A dynamic event-trigger rule adjusts the trigger condition in real time, reducing the computation and communication required to compute the Nash equilibrium. Lyapunov stability analysis establishes finite-time stability of the closed-loop equilibrium. Numerical simulations and unmanned aerial vehicle (UAV) hardware tests illustrate the efficacy of the proposed finite-time safe control algorithm.
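For orientation, the formulation below is a standard N-player nonzero-sum game setup consistent with the abstract's description. The paper's specific finite-time space transformation is not reproduced here, and the barrier choice shown is only one common option, not necessarily the authors' design.

```latex
% Dynamics with one input channel per player, and player i's barrier-augmented value:
\dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\, u_j, \qquad
V_i(x(t)) = \int_{t}^{\infty} \Big( x^{\top} Q_i\, x
          + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j + \mathcal{B}_i(x) \Big)\, \mathrm{d}\tau .

% Coupled HJB / Nash equilibrium condition for player i, with value gradient \nabla V_i:
0 = \nabla V_i^{\top} \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*} \Big)
  + x^{\top} Q_i\, x + \sum_{j=1}^{N} (u_j^{*})^{\top} R_{ij}\, u_j^{*} + \mathcal{B}_i(x),
\qquad
u_i^{*} = -\tfrac{1}{2}\, R_{ii}^{-1} g_i^{\top}(x)\, \nabla V_i .

% One common log-barrier keeping a safety output h_k(x) inside |h_k(x)| < b_k:
\mathcal{B}_i(x) = \sum_{k} \log \frac{b_k^{2}}{\, b_k^{2} - h_k^{2}(x) \,} .
```

As a rough, self-contained illustration of how the abstract's ingredients (a critic per player, a concurrent-learning replay buffer, and a dynamic event trigger) can fit together, the following Python sketch uses a toy two-player linear system. Every model, feature, and gain choice is an assumption for illustration, not the paper's quadcopter implementation or its finite-time update law.

```python
import numpy as np

# Toy two-player linear game: xdot = A x + B1 u1 + B2 u2, quadratic costs.
# All matrices and gains below are illustrative assumptions.
A = np.array([[0., 1.], [-1., -1.]])
B = [np.array([[0.], [1.]]), np.array([[0.], [0.5]])]   # one input channel per player
Q = [np.eye(2), 2 * np.eye(2)]
R = [np.array([[1.]]), np.array([[2.]])]

phi = lambda x: np.array([x[0]**2, x[0]*x[1], x[1]**2])                 # critic features
dphi = lambda x: np.array([[2*x[0], 0.], [x[1], x[0]], [0., 2*x[1]]])   # Jacobian of phi

W = [np.ones(3), np.ones(3)]       # critic weights, one per player
memory = []                        # concurrent-learning buffer of (x, u, xdot)
alpha, lam, beta = 2.0, 1.0, 0.05  # learning rate, trigger decay, trigger gain
eta = 1.0                          # dynamic event-trigger internal variable
x = np.array([1.0, -0.5])
x_hat, dt = x.copy(), 0.01         # last-broadcast state, integration step

def control(i, xs):
    # u_i = -1/2 R_i^{-1} B_i^T (dphi/dx)^T W_i, i.e. gradient of the critic's value
    return -0.5 * np.linalg.solve(R[i], B[i].T @ dphi(xs).T @ W[i])

for k in range(5000):
    u = [control(i, x_hat) for i in range(2)]             # controls held between events
    xdot = A @ x + sum(B[i] @ u[i] for i in range(2))

    # Dynamic event trigger: broadcast the state only when the measurement gap
    # outgrows the adaptive threshold beta*||x||^2 + eta.
    e = x - x_hat
    eta += dt * (-lam * eta + beta * (x @ x) - e @ e)
    if e @ e >= beta * (x @ x) + max(eta, 0.0):
        x_hat = x.copy()

    if k % 50 == 0:                                       # store samples for replay
        memory.append((x.copy(), [ui.copy() for ui in u], xdot.copy()))

    for i in range(2):
        def residual(xs, us, xds):
            r = xs @ Q[i] @ xs + us[i] @ R[i] @ us[i]     # instantaneous cost
            sigma = dphi(xs) @ xds                        # regressor: dphi/dx * xdot
            return sigma, W[i] @ sigma + r                # (regressor, Bellman residual)

        sigma, delta = residual(x, u, xdot)               # current-sample gradient step
        dW = -alpha * sigma * delta / (1.0 + sigma @ sigma) ** 2
        for xs, us, xds in memory[-20:]:                  # concurrent-learning replay
            s, d = residual(xs, us, xds)
            dW += -alpha * s * d / (len(memory[-20:]) * (1.0 + s @ s) ** 2)
        W[i] = W[i] + dt * dW

    x = x + dt * xdot                                     # Euler step of the plant

print("learned critic weights:", [w.round(3) for w in W])
```

The internal variable eta relaxes the static threshold beta*||x||^2 while the broadcast error stays small, which matches the abstract's point that adapting the trigger condition online reduces the computation and communication spent on recomputing the Nash equilibrium.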
Pages: 21