Optimal consensus control for multi-agent systems: Multi-step policy gradient adaptive dynamic programming method

Cited by: 4
Authors
Ji, Lianghao [1, 3]
Jian, Kai [1]
Zhang, Cuijuan [1]
Yang, Shasha [1]
Guo, Xing [1]
Li, Huaqing [2]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing, Peoples R China
[2] Southwest Univ, Coll Elect & Informat Engn, Chongqing, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
complex networks; dynamic programming; intelligent control; multi-agent systems; optimal control; optimal tracking control; algorithm; framework
DOI
10.1049/cth2.12473
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a novel adaptive dynamic programming (ADP) method for solving the optimal consensus problem for a class of discrete-time multi-agent systems with completely unknown dynamics. Unlike classical RL-based optimal control algorithms, which rely on the one-step temporal difference method, the proposed multi-step (also called n-step) policy gradient ADP (MS-PGADP) algorithm is used to obtain the iterative control policies; multi-step methods have been shown to be more efficient because they propagate the reward faster. Moreover, a novel Q-function is defined, which evaluates the performance of taking an action in the current state. Then, using the Lyapunov stability theorem and functional analysis, the optimality of the performance index function and the stability of the error system are proved. Furthermore, actor-critic neural networks (NNs) are used to implement the proposed method. Inspired by the deep Q-network, a target network is also introduced to guarantee the stability of the NNs during training. Finally, two simulations are conducted to verify the effectiveness of the proposed algorithm.
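The difference between a one-step and an n-step temporal difference target mentioned in the abstract can be illustrated with a generic sketch. This is not the paper's MS-PGADP algorithm: the function name `n_step_target`, the discount factor `gamma`, and the sample numbers are assumptions chosen only for illustration, and the bootstrap value stands in for whatever a critic or target network would return.

```python
def n_step_target(rewards, bootstrap_value, gamma):
    """Generic n-step bootstrapped target (illustrative, not the paper's method).

    Computes sum_{k=0}^{n-1} gamma**k * rewards[k] + gamma**n * bootstrap_value,
    where n = len(rewards) and bootstrap_value is a hypothetical Q-estimate
    at the state reached after n steps.
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# One-step target: a single observed reward plus the discounted estimate.
one_step = n_step_target([1.0], bootstrap_value=10.0, gamma=0.5)

# Three-step target: three observed rewards before bootstrapping.
three_step = n_step_target([1.0, 2.0, 4.0], bootstrap_value=8.0, gamma=0.5)
```

With n = 1 the function reduces to the usual one-step target; larger n lets more observed rewards enter the target before the estimate is used, which is the sense in which the reward propagates faster.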
Pages: 1443-1457
Number of pages: 15
Related papers
50 in total
  • [21] Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control
    Luo, Biao
    Liu, Derong
    Wu, Huai-Ning
    Wang, Ding
    Lewis, Frank L.
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3341 - 3354
  • [22] Consensus of multi-agent systems with unknown control directions by uniting dynamic and switching adaptive feedback
    Yu, Linzhen
    Liu, Yungang
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 5136 - 5141
  • [23] Optimal Output Synchronization of Nonlinear Multi-agent Systems using Approximate Dynamic Programming
    Modares, Hamidreza
    Lewis, Frank L.
    Davoudi, Ali
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 4227 - 4232
  • [24] Adaptive impulsive consensus of multi-agent systems with control gain error
    Zhang, Liuyang
    Li, Teng
    Huang, Tao
    Huang, Junhao
    Ma, Tiedong
    2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 4610 - 4615
  • [25] Consensus control of multi-agent systems with delays
    Gong, Yi
    ELECTRONIC RESEARCH ARCHIVE, 2024, 32 (08): : 4887 - 4904
  • [26] An adaptive critic-based scheme for consensus control of nonlinear multi-agent systems
    Heydari, Ali
    Balakrishnan, S. N.
    INTERNATIONAL JOURNAL OF CONTROL, 2014, 87 (12) : 2463 - 2474
  • [27] Distributed Optimal Consensus Control Algorithm for Continuous-Time Multi-Agent Systems
    Wang, Qishao
    Duan, Zhisheng
    Wang, Jingyao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2020, 67 (01) : 102 - 106
  • [28] Temporal difference learning with multi-step returns for intelligent optimal control of dynamic systems
    Xin, Peng
    Wang, Ding
    Liu, Ao
    Qiao, Junfei
    NEUROCOMPUTING, 2025, 622
  • [29] Topology Switching for Optimal Leader-Following Consensus of Multi-Agent Systems
    Fan, Tianpeng
    Wan, Quan
    Ding, Zhengtao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (08) : 3845 - 3849
  • [30] Multi-Agent Optimal Consensus With Unknown Control Directions
    Tang, Yutao
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (04): : 1201 - 1206