Data-Based Optimal Synchronization of Heterogeneous Multiagent Systems in Graphical Games via Reinforcement Learning

Cited by: 19
Authors
Xiong, Chunping [1 ]
Ma, Qian [1 ]
Guo, Jian [1 ]
Lewis, Frank L. [2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Peoples R China
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX 76118 USA
Keywords
Games; Synchronization; Heuristic algorithms; System dynamics; Optimal control; Artificial neural networks; Performance analysis; Graphical games; heterogeneous MASs; Nash equilibrium; optimal synchronization; reinforcement learning (RL);
DOI
10.1109/TNNLS.2023.3291542
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This article studies the optimal synchronization of linear heterogeneous multiagent systems (MASs) with partially unknown system dynamics. The objective is to achieve synchronization while minimizing the performance index of each agent. A framework of heterogeneous multiagent graphical games is formulated first. Within these graphical games, it is proved that the optimal control policy, which relies on the solution of the Hamilton-Jacobi-Bellman (HJB) equation, not only constitutes a Nash equilibrium but is also the best response to the fixed control policies of its neighbors. To obtain the optimal control policy and the minimum value of the performance index, a model-based policy iteration (PI) algorithm is proposed. Then, building on the model-based algorithm, a data-based off-policy integral reinforcement learning (IRL) algorithm is put forward to handle the partially unknown system dynamics. Furthermore, a single-critic neural network (NN) structure is used to implement the data-based algorithm. Based on the data collected under the behavior policy of the data-based off-policy algorithm, the gradient descent method is used to train the NNs so that their weights approach the ideal weights. In addition, it is proved that all the proposed algorithms are convergent and that the weight-tuning law of the single-critic NNs promotes optimal synchronization. Finally, a numerical example is presented to demonstrate the effectiveness of the theoretical analysis.
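As a rough illustration of the model-based policy iteration step mentioned in the abstract, the sketch below applies Kleinman-type PI to a single linear-quadratic agent: policy evaluation solves a Lyapunov equation for the current gain, and policy improvement updates the gain from the resulting value matrix. This is a simplified single-agent analog under assumed dynamics, not the paper's coupled multiagent graphical-game algorithm; the matrices A, B, Q, R and the initial gain K0 are illustrative assumptions.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def policy_iteration(A, B, Q, R, K0, iters=20):
    # Kleinman-type PI for one LQR agent (simplified analog of model-based PI).
    K = K0  # K0 must be stabilizing: eig(A - B @ K0) in the open left half-plane
    for _ in range(iters):
        Ac = A - B @ K
        # Policy evaluation: solve Ac^T P + P Ac + Q + K^T R K = 0
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
        # Policy improvement: u = -K x with K = R^{-1} B^T P
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Illustrative data (assumed, not taken from the paper)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # Hurwitz, so K0 = 0 is stabilizing
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))
P, K = policy_iteration(A, B, Q, R, K0)
print("P =", P, "K =", K)

In the paper's multiagent setting, the analogous evaluation step would involve the local neighborhood error and the neighbors' fixed policies, which is what the subsequent off-policy IRL algorithm replaces with measured data.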
Pages: 15984-15992
Number of pages: 9