Model-Free Algorithms for Containment Control of Saturated Discrete-Time Multiagent Systems via Q-Learning Method

Cited by: 19
Authors
Long, Mingkang [1 ,2 ]
Su, Housheng [1 ,2 ]
Zeng, Zhigang [1 ,2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Huazhong Univ Sci & Technol, Key Lab Image Proc & Intelligent Control, Educ Minist China, Wuhan 430074, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2022, Vol. 52, No. 02
Funding
National Natural Science Foundation of China;
Keywords
Discrete-time; global containment control; input saturation; multiagent systems (MASs); Q-learning (QL) algorithm; DYNAMIC LEADERS; LINEAR-SYSTEMS; NETWORKS;
DOI
10.1109/TSMC.2020.3019504
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In this article, we propose two model-free algorithms, using state or output feedback, for saturated discrete-time multiagent systems (SDTMASs) to attain global containment control. In most previous works, the control input avoids saturation via the low gain feedback (LGF) method, which requires knowledge of the agent dynamics, and the SDTMASs can attain only semi-global containment control. In contrast with those works, this article first defines a Q-function based on the Q-learning (QL) technique and deduces the corresponding QL Bellman equation, which is the central component of the QL algorithm. Then, to solve the QL Bellman equation, we propose two iterative model-free algorithms using state and output feedback, respectively; the LGF matrix can be acquired directly from the resulting solution. Furthermore, under the state and output feedback control protocols with the feedback matrices obtained from the proposed model-free algorithms, the SDTMASs achieve global rather than merely semi-global containment control. Finally, we present simulations that confirm the validity of the proposed algorithms.
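The record does not reproduce the paper's equations, so the following is only a minimal LaTeX sketch of the generic QL construction the abstract refers to, written for a single discrete-time linear system $x_{k+1} = A x_k + B u_k$ with assumed cost weights $Q_c \succeq 0$ and $R \succ 0$; the paper's multiagent containment formulation will differ in detail.

% Minimal sketch under the assumptions stated above; the symbols
% Q_c, R, P, H are illustrative and not taken from the paper itself.
\[
Q(x_k, u_k)
  = x_k^{\top} Q_c\, x_k + u_k^{\top} R\, u_k + V^{*}(x_{k+1})
  = z_k^{\top} H z_k,
\qquad
z_k = \begin{bmatrix} x_k \\ u_k \end{bmatrix},
\qquad
H = \begin{bmatrix}
      Q_c + A^{\top} P A & A^{\top} P B \\
      B^{\top} P A       & R + B^{\top} P B
    \end{bmatrix}.
\]
% The QL Bellman equation, evaluated along measured trajectories:
\[
z_k^{\top} H z_k
  = x_k^{\top} Q_c\, x_k + u_k^{\top} R\, u_k + z_{k+1}^{\top} H z_{k+1},
\]
% which is linear in the entries of H and can therefore be solved from
% input-state data by least squares, without knowledge of (A, B); the
% feedback gain then follows directly from the blocks of H:
\[
u_k = -K x_k,
\qquad
K = H_{uu}^{-1} H_{ux} = \bigl(R + B^{\top} P B\bigr)^{-1} B^{\top} P A.
\]

Iterating the two steps (estimate $H$ under the current gain, then update $K$ from $H$) is the standard QL policy iteration; per the abstract, the LGF matrix is obtained from the converged solution in the same model-free way.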
Pages: 1308-1316
Number of pages: 9