Data-driven distributed output consensus control for multi-agent systems with unknown internal state

Cited by: 0
Authors
Zhang, Cuijuan [1 ,2 ]
Ji, Lianghao [1 ]
Yang, Shasha [1 ]
Guo, Xing [1 ]
Li, Huaqing [3 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[2] Anqing Normal Univ, Key Lab Intelligent Percept & Comp Anhui Prov, Anqing 246133, Peoples R China
[3] Southwest Univ, Coll Elect & Informat Engn, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
ECONOMIC-DISPATCH; OPTIMIZATION; SYNCHRONIZATION; NETWORKS; AGENTS;
DOI
10.1016/j.neucom.2024.128868
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses optimal output consensus control of linear time-invariant discrete-time multi-agent systems (MASs). Achieving optimal output consensus for MASs requires solving the coupled Hamilton-Jacobi-Bellman equation, which generally has no tractable analytical solution. Furthermore, in most real-world systems the internal states are too complex to obtain directly. To address these issues, a modified deep Q-learning network is constructed from current and historical system data rather than from a precise model of the system. First, the internal state of each agent is reconstructed by an adaptive distributed observer based on output feedback, which prevents the instability caused by the augmented systems. The local error system of each agent is then redefined. Based on the redefined error system, a data-driven adaptive dynamic programming (ADP) method is introduced and implemented with an actor-critic neural network structure. In addition, an experience replay strategy is proposed to reduce the propagation of estimation bias and improve the learning speed. Finally, comparative numerical simulations quantitatively demonstrate the effectiveness of the proposed algorithm.
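As a rough illustration of the experience replay idea mentioned in the abstract, the following minimal sketch stores past transitions in a fixed-size buffer and draws random mini-batches from it. It is not the authors' implementation: the class name `ReplayBuffer`, the transition layout (error, control, cost, next error), and the capacity and batch sizes are all assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (error, control, cost, next_error) transitions.

    Hypothetical helper for illustration only; the record does not
    describe the data layout used in the paper.
    """

    def __init__(self, capacity, seed=0):
        # A deque with maxlen silently discards the oldest item when full.
        self._data = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, error, control, cost, next_error):
        self._data.append((error, control, cost, next_error))

    def sample(self, batch_size):
        # Draw without replacement; shrink the batch if the buffer is small.
        k = min(batch_size, len(self._data))
        return self._rng.sample(list(self._data), k)

    def __len__(self):
        return len(self._data)


# Usage: fill past capacity, then draw a mini-batch for a learning update.
buffer = ReplayBuffer(capacity=3)
for k in range(5):
    buffer.add(error=float(k), control=0.0, cost=float(k) ** 2,
               next_error=float(k + 1))
batch = buffer.sample(batch_size=2)
```

Sampling at random from stored data, instead of always using the most recent transition, is the usual way replay reduces correlation between consecutive updates; how the paper weights or selects samples is not stated in this record.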
Pages: 10
Related papers
34 in total
  • [1] Busoniu L, 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f
  • [2] Practical Synchronization in Networks of Nonlinear Heterogeneous Agents With Application to Power Systems
    Chowdhury, Dhrubajit
    Khalil, Hassan K.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (01) : 184 - 198
  • [3] Distributed Fixed-Time Optimization in Economic Dispatch Over Directed Networks
    Dai, Hao
    Jia, Jinping
    Yan, Li
    Fang, Xinpeng
    Chen, Weisheng
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (05) : 3011 - 3019
  • [4] Approximate dynamic programming for stochastic N-stage optimization with application to optimal consumption under uncertainty
    Gaggero, Mauro
    Gnecco, Giorgio
    Sanguineti, Marcello
    [J]. COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2014, 58 (01) : 31 - 85
  • [5] Dynamic Programming and Value-Function Approximation in Sequential Decision Problems: Error Analysis and Numerical Results
    Gaggero, Mauro
    Gnecco, Giorgio
    Sanguineti, Marcello
    [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2013, 156 (02) : 380 - 416
  • [6] Suboptimal Solutions to Dynamic Optimization Problems via Approximations of the Policy Functions
    Gnecco, G.
    Sanguineti, M.
    [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2010, 146 (03) : 764 - 794
  • [7] Adaptive critic design for nonlinear multi-player zero-sum games with unknown dynamics and control constraints
    Huo, Yu
    Wang, Ding
    Qiao, Junfei
    Li, Menghua
    [J]. NONLINEAR DYNAMICS, 2023, 111 (12) : 11671 - 11683
  • [8] Data-Driven Distributed Output Consensus Control for Partially Observable Multiagent Systems
    Jiang, He
    He, Haibo
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (03) : 848 - 858
  • [9] Optimal Tracking Control of Unknown Discrete-Time Linear Systems Using Input-Output Measured Data
    Kiumarsi, Bahare
    Lewis, Frank L.
    Naghibi-Sistani, Mohammad-Bagher
    Karimpour, Ali
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (12) : 2770 - 2779
  • [10] Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data
    Lewis, F. L.
    Vamvoudakis, Kyriakos G.
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2011, 41 (01) : 14 - 25