Data-driven distributed output consensus control for multi-agent systems with unknown internal state

Cited by: 1
Authors
Zhang, Cuijuan [1,2]
Ji, Lianghao [1]
Yang, Shasha [1]
Guo, Xing [1]
Li, Huaqing [3]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[2] Anqing Normal Univ, Key Lab Intelligent Percept & Comp Anhui Prov, Anqing 246133, Peoples R China
[3] Southwest Univ, Coll Elect & Informat Engn, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
ECONOMIC-DISPATCH; OPTIMIZATION; SYNCHRONIZATION; NETWORKS; AGENTS;
DOI
10.1016/j.neucom.2024.128868
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
This paper studies optimal output consensus control of linear time-invariant discrete-time multi-agent systems (MASs). Achieving optimal output consensus for MASs hinges on solving the coupled Hamilton-Jacobi-Bellman equation, a task typically precluded by the intractability of analytical solutions. Furthermore, most real-world systems are too complex for their internal states to be measured directly. To address these issues, a modified deep Q-learning network is constructed from current and historical system data rather than from a precise model of the system. First, the internal state of each agent is reconstructed by an adaptive distributed observer based on output feedback, which avoids the instability that augmented-system formulations can introduce; the local error system of each agent is then redefined. Based on the redefined error system, a data-driven adaptive dynamic programming (ADP) method is introduced and realized with an actor-critic neural network structure. In addition, an experience replay strategy is proposed to reduce the propagation of estimation bias and to improve the learning speed. Finally, comparative numerical simulations quantitatively substantiate the efficacy of the proposed algorithm.
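The data-driven ADP idea summarized above can be illustrated, in highly simplified form, on a scalar system: a quadratic Q-function is fitted by least squares from a replay buffer of measured transitions, and the policy is then improved greedily. This is only an illustrative sketch under assumed toy dynamics (the values of `a`, `b`, `q`, `r` below are hypothetical), not the paper's distributed actor-critic algorithm with output-feedback observers.

```python
import numpy as np

# Hypothetical scalar LTI plant x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2.
# The learner never uses (a, b) directly; they only generate the measured data.
a, b = 0.9, 0.5
q, r = 1.0, 1.0

def features(x, u):
    # Quadratic basis so that Q(x, u) = h1*x^2 + h2*x*u + h3*u^2
    return np.array([x * x, x * u, u * u])

K = 0.0        # initial stabilizing feedback gain, policy u = -K*x
replay = []    # experience replay buffer of (x, u, cost, x_next) tuples
rng = np.random.default_rng(0)

for _ in range(10):
    # Collect transitions under the current policy plus exploration noise
    x = 1.0
    for _ in range(30):
        u = -K * x + 0.1 * rng.standard_normal()
        x_next = a * x + b * u
        replay.append((x, u, q * x * x + r * u * u, x_next))
        x = x_next
    # Policy evaluation: least-squares fit of the Bellman identity
    # Q(x,u) = cost + Q(x', -K*x') over all replayed transitions
    Phi = [features(x0, u0) - features(x1, -K * x1) for (x0, u0, c, x1) in replay]
    y = [c for (_, _, c, _) in replay]
    h = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)[0]
    # Policy improvement: argmin_u Q(x,u) gives u = -(h2 / (2*h3)) * x
    K = h[1] / (2.0 * h[2])
```

Because the Bellman fit uses replayed off-policy data, the learned gain `K` approaches the optimal linear-quadratic gain without the dynamics ever being identified explicitly; the paper's method extends this idea to multi-agent output consensus with neural-network approximation.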
Pages: 10
Cited References (34 in total; first 10 listed)
[1] An, Chen; Zhou, Jiaxi. Adaptive dynamic programming for data-based optimal state regulation with experience replay. Neurocomputing, 2023, 554.
[2] Busoniu, L. Automation and Control Engineering Series, 2010, p. 1. DOI: 10.1201/9781439821091-f
[3] Chowdhury, Dhrubajit; Khalil, Hassan K. Practical synchronization in networks of nonlinear heterogeneous agents with application to power systems. IEEE Transactions on Automatic Control, 2021, 66(1): 184-198.
[4] Dai, Hao; Jia, Jinping; Yan, Li; Fang, Xinpeng; Chen, Weisheng. Distributed fixed-time optimization in economic dispatch over directed networks. IEEE Transactions on Industrial Informatics, 2021, 17(5): 3011-3019.
[5] Gaggero, Mauro; Gnecco, Giorgio; Sanguineti, Marcello. Approximate dynamic programming for stochastic N-stage optimization with application to optimal consumption under uncertainty. Computational Optimization and Applications, 2014, 58(1): 31-85.
[6] Gaggero, Mauro; Gnecco, Giorgio; Sanguineti, Marcello. Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results. Journal of Optimization Theory and Applications, 2013, 156(2): 380-416.
[7] Gnecco, G.; Sanguineti, M. Suboptimal solutions to dynamic optimization problems via approximations of the policy functions. Journal of Optimization Theory and Applications, 2010, 146(3): 764-794.
[8] Huo, Yu; Wang, Ding; Qiao, Junfei; Li, Menghua. Adaptive critic design for nonlinear multi-player zero-sum games with unknown dynamics and control constraints. Nonlinear Dynamics, 2023, 111(12): 11671-11683.
[9] Jiang, He; He, Haibo. Data-driven distributed output consensus control for partially observable multiagent systems. IEEE Transactions on Cybernetics, 2019, 49(3): 848-858.
[10] Kiumarsi, Bahare; Lewis, Frank L.; Naghibi-Sistani, Mohammad-Bagher; Karimpour, Ali. Optimal tracking control of unknown discrete-time linear systems using input-output measured data. IEEE Transactions on Cybernetics, 2015, 45(12): 2770-2779.