Distributed Optimal Consensus Problem of Input-Constrained Nonlinear Discrete-Time MASs: A Model-Free Reinforcement Learning Approach

Times Cited: 0
Authors
Xuan, Shuxing [1 ]
Liang, Hongjing [1 ]
Huang, Shihao [1 ]
Li, Tieshan [1 ]
Sun, Jiayue [2 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu 611731, Sichuan, Peoples R China
[2] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Synchronization; Optimal control; Mathematical models; Actuators; Vectors; Reinforcement learning; Protocols; Consensus control; Vehicle dynamics; System dynamics; Discrete-time multiagent systems (MASs); distributed synchronization; gradual transition control (GTC); input constrained; optimal consensus control; reinforcement learning (RL); DIFFERENTIAL GRAPHICAL GAMES; MULTIAGENT SYSTEMS; COORDINATION;
DOI
10.1109/TCYB.2025.3562390
CLC Classification Number
TP [Automation technology; computer technology];
Discipline Classification Code
0812;
Abstract
In this article, a model-free reinforcement learning (RL) approach is proposed for solving the optimal consensus control problem of nonlinear discrete-time multiagent systems with input constraints. To address the difficulty of solving the coupled discrete Hamilton-Jacobi-Bellman (HJB) equation, an RL approach based on an actor-critic framework is developed for optimal consensus control. A well-defined cost function is designed, and the actor and critic networks are updated through online learning to obtain the optimal controllers. Furthermore, since actuator performance is often limited by physical constraints, a gradual transition control (GTC) method is proposed to handle actuator saturation, and update-free and update-weak policies are introduced to further improve network performance. Additionally, in real-world distributed systems, the actor-critic networks deployed on each agent rely on data from neighboring agents, which raises the issue of distributed synchronization. To address this challenge, a synchronization blocking method is designed, which introduces additional control signals for each agent. Finally, two simulations under different scenarios are presented to verify the effectiveness of the proposed approach.
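The abstract's core idea, an actor-critic update that learns a constrained-input controller online without a system model, can be illustrated with a minimal single-agent sketch. This is a hypothetical toy, not the paper's algorithm: the quadratic feature map, the dynamics in `step`, the learning gains, and the tanh saturation (a common way to enforce input bounds in RL-based optimal control) are all assumptions for illustration.

```python
import numpy as np

def features(x):
    # Hypothetical quadratic feature vector for a 2-D state.
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])

def actor(x, wa, u_max=1.0):
    # Saturated control: tanh keeps the input inside [-u_max, u_max].
    return u_max * np.tanh(wa @ features(x))

def critic(x, wc):
    # Linear-in-features value (cost-to-go) approximation.
    return wc @ features(x)

def step(x, u):
    # Hypothetical nonlinear discrete-time dynamics (stands in for one agent).
    return np.array([0.9 * x[0] + 0.1 * x[1],
                     -0.2 * np.sin(x[0]) + 0.8 * x[1] + 0.5 * u])

wa = np.zeros(3)                     # actor weights
wc = np.zeros(3)                     # critic weights
gamma, alpha_c, alpha_a = 0.95, 0.05, 0.01
x = np.array([1.0, -0.5])

for k in range(200):
    u = actor(x, wa)
    x_next = step(x, u)
    cost = x @ x + u ** 2            # stage cost (consensus-error analog)
    # One-step temporal-difference error drives both updates.
    td = cost + gamma * critic(x_next, wc) - critic(x, wc)
    wc += alpha_c * td * features(x)     # critic: reduce TD error
    wa -= alpha_a * td * features(x)     # actor: heuristic descent on TD error
    x = x_next
```

In the multiagent setting of the paper, each agent would run such a pair of networks on its local consensus error, using data exchanged with its neighbors; the tanh layer is one standard device for keeping the learned input inside the actuator bound throughout learning.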
Pages: 2910-2923
Number of pages: 14