Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems

Cited by: 31

Authors:

Zhang, Huaipin ^{[1,2]}

Park, Ju H. ^{[2]}

Yue, Dong ^{[1]}

Xie, Xiangpeng ^{[1]}

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Inst Adv Technol, Nanjing 210023, Peoples R China

[2] Yeungnam Univ, Dept Elect Engn, Gyongsan 38541, South Korea

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2020 / Vol. 50 / No. 02

Funding:

National Research Foundation of Singapore; National Natural Science Foundation of China;

Keywords:

Mathematical model; Delays; Delay effects; Approximation algorithms; Adaptation models; Performance analysis; Multi-agent systems; Finite-horizon; multiagent systems (MASs); off-policy reinforcement learning (RL); optimal consensus control; state delays; DIFFERENTIAL GRAPHICAL GAMES; H-INFINITY CONTROL; NONLINEAR-SYSTEMS; AVERAGE CONSENSUS; SYNCHRONIZATION; TOPOLOGIES; DYNAMICS;

DOI:

10.1109/TCYB.2018.2856510

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This paper investigates the finite-horizon optimal consensus control problem for unknown multiagent systems with state delays. It is well known that the optimal consensus control is given by the solutions to the coupled Hamilton-Jacobi-Bellman (HJB) equations. An off-policy reinforcement learning (RL) algorithm is developed to learn the two-stage optimal consensus solutions to the coupled time-varying HJB equations using measurable state data instead of knowledge of the state-delayed system dynamics. Subsequently, for each agent, a single critic neural network (NN) is used to approximate the time-varying cost function and to help calculate the optimal consensus control policy. Based on the method of weighted residuals, adaptive weight update laws for the critic NNs are proposed. Finally, simulation results are provided to illustrate the effectiveness of the proposed off-policy RL method.


Pages: 402-413

Page count: 12
