Adaptive optimal consensus of nonlinear multi-agent systems with unknown dynamics using off-policy integral reinforcement learning

Citations: 0
Authors
Yan, Lei [1]
Liu, Zhi [2, 4]
Chen, C. L. Philip [3]
Zhang, Yun [2]
Wu, Zongze [2]
Affiliations
[1] Nanyang Inst Technol, Sch Intelligent Mfg, Nanyang 473004, Henan, Peoples R China
[2] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China
[3] South China Univ Technol, Fac Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[4] Pazhou Lab, Guangzhou 510006, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive optimal consensus; Integral reinforcement learning; Off-policy; Completely unknown dynamics; TRACKING CONTROL; GAMES;
DOI
10.1016/j.neucom.2024.129185
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) has been identified as a promising approach for developing adaptive optimal consensus schemes for high-order strict-feedback nonlinear multi-agent systems (MASs). However, existing methods have limitations, as they can only be applied to systems with partially unknown dynamics and require an identifier-actor-critic framework. This paper proposes a novel approach that combines classical backstepping techniques and off-policy integral reinforcement learning (IRL) to circumvent these limitations and develop an adaptive optimal consensus scheme for nonlinear MASs with completely unknown dynamics. Specifically, we introduce an off-policy IRL-based adaptive optimal consensus scheme that can obtain optimal control inputs without knowledge of the system dynamics. The algorithm utilizes the actor-critic structure and updates the weight vectors using only one learning rule in each step based on the collected system trajectory data. We have proven that the optimal consensus is achieved, and the estimation errors of the optimal weight vectors are uniformly ultimately bounded (UUB). Finally, we present a simulation example to validate the effectiveness of the proposed approach.
Pages: 10
Related papers
45 in total
  • [1] Communication-Aware Formation Control of AUVs With Model Uncertainty and Fading Channel via Integral Reinforcement Learning
    Cao, Wenqiang
    Yan, Jing
    Yang, Xian
    Luo, Xiaoyuan
    Guan, Xinping
    [J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2023, 10 (01) : 159 - 176
  • [2] Adaptive optimal output tracking of continuous-time systems via output-feedback-based reinforcement learning
    Chen, Ci
    Xie, Lihua
    Xie, Kan
    Lewis, Frank L.
    Xie, Shengli
    [J]. AUTOMATICA, 2022, 146
  • [3] Homotopic policy iteration-based learning design for unknown linear continuous-time systems
    Chen, Ci
    Lewis, Frank L.
    Li, Bo
    [J]. AUTOMATICA, 2022, 138
  • [4] Off-policy learning for adaptive optimal output synchronization of heterogeneous multi-agent systems
    Chen, Ci
    Lewis, Frank L.
    Xie, Kan
    Xie, Shengli
    Liu, Yilu
    [J]. AUTOMATICA, 2020, 119
  • [5] Reinforcement Learning-Based Adaptive Optimal Exponential Tracking Control of Linear Systems With Unknown Dynamics
    Chen, Ci
    Modares, Hamidreza
    Xie, Kan
    Lewis, Frank L.
    Wan, Yan
    Xie, Shengli
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (11) : 4423 - 4438
  • [6] Adaptive-Critic-Based Event-Triggered Intelligent Cooperative Control for a Class of Second-Order Constrained Multiagent Systems
    Guo Z.
    Ren H.
    Li H.
    Zhou Q.
    [J]. IEEE Transactions on Artificial Intelligence, 2023, 4 (06) : 1654 - 1665
  • [7] Leader-Follower Flocking of Multiple Robotic Fish
    Jia, Yongnan
    Wang, Long
    [J]. IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2015, 20 (03) : 1372 - 1383
  • [8] Reinforcement learning and cooperative H∞ output regulation of linear continuous-time multi-agent systems
    Jiang, Yi
    Gao, Weinan
    Wu, Jin
    Chai, Tianyou
    Lewis, Frank L.
    [J]. AUTOMATICA, 2023, 148
  • [9] Optimal and Autonomous Control Using Reinforcement Learning: A Survey
    Kiumarsi, Bahare
    Vamvoudakis, Kyriakos G.
    Modares, Hamidreza
    Lewis, Frank L.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) : 2042 - 2062
  • [10] Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control
    Lewis, Frank L.
    Vrabie, Draguna
    [J]. IEEE CIRCUITS AND SYSTEMS MAGAZINE, 2009, 9 (03) : 32 - 50