Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory

Cited by: 2
Authors
Zhao, Jin-Gang [1 ]
Affiliation
[1] Weifang Univ, Inst Intelligent Percept & Optimizat Control Compl, Sch Machinery & Automat, Weifang 261061, Shandong, Peoples R China
Keywords
Discrete-time; fully cooperative game (FCG); multi-player systems; Q-learning; tracking control; ZERO-SUM GAMES; NONLINEAR-SYSTEMS;
DOI
10.1007/s12555-022-1133-1
CLC Number
TP [Automation technology; Computer technology];
Discipline Code
0812;
Abstract
This paper studies the fully cooperative game tracking control problem (FCGTCP) for a class of discrete-time multi-player linear systems with unknown dynamics. The reference trajectory is generated by a command generator system. An augmented multi-player system, composed of the original multi-player system and the command generator, is constructed, and an exponentially discounted cost function is introduced to derive an augmented fully cooperative game tracking algebraic Riccati equation (FCGTARE). When the system dynamics are known, a model-based policy iteration (PI) algorithm is proposed to solve the augmented FCGTARE. Furthermore, to remove the requirement of known system dynamics, an online reinforcement Q-learning algorithm is designed to obtain the solution of the augmented FCGTARE. The convergence of the designed online reinforcement Q-learning algorithm is proved. Finally, two simulation examples are given to verify the validity of the model-based PI algorithm and the online reinforcement Q-learning algorithm.
Pages: 1751-1759
Number of pages: 9
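
The following is a minimal sketch of the model-based policy iteration step described in the abstract, written in Python with NumPy/SciPy. It is not the paper's code: the system matrices A, B1, B2, the command generator F, the weights Q and R, the discount factor gamma, and the function name policy_iteration are all illustrative assumptions. Because the game is fully cooperative, all players minimize one common discounted cost, so their inputs can be stacked and the augmented FCGTARE is instantiated here as a standard discounted tracking Riccati equation.

    # Sketch only: illustrative data, not taken from the paper.
    import numpy as np
    from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

    A  = np.array([[0.8, 0.2],
                   [0.0, 0.7]])      # plant: x_{k+1} = A x_k + B1 u1_k + B2 u2_k
    B1 = np.array([[0.0],
                   [1.0]])           # player 1 input channel
    B2 = np.array([[1.0],
                   [0.0]])           # player 2 input channel
    F  = np.array([[1.0]])           # command generator: r_{k+1} = F r_k (constant reference)
    gamma = 0.9                      # exponential discount factor

    # Augmented state X = [x; r]; fully cooperative players share one cost,
    # so the input channels are stacked into a single input matrix.
    T = np.block([[A, np.zeros((2, 1))],
                  [np.zeros((1, 2)), F]])
    B = np.vstack([np.hstack([B1, B2]),
                   np.zeros((1, 2))])

    c = np.array([[1.0, 0.0, -1.0]])  # tracking error e_k = x1_k - r_k
    Q = c.T @ c                       # state weight on the augmented state
    R = np.diag([1.0, 2.0])           # common input weight (block-diagonal over players)

    def policy_iteration(T, B, Q, R, gamma, K0, iters=50):
        """Hewer-type PI for the discounted cooperative LQ tracking problem (sketch)."""
        K = K0
        for _ in range(iters):
            # Policy evaluation: P = Q + K'RK + gamma*(T - B K)' P (T - B K),
            # solved as a discrete Lyapunov equation with sqrt(gamma) scaling.
            Ac = T - B @ K
            P = solve_discrete_lyapunov(np.sqrt(gamma) * Ac.T, Q + K.T @ R @ K)
            # Policy improvement: u = -K X with K = gamma*(R + gamma*B'PB)^{-1} B'PT
            K = gamma * np.linalg.solve(R + gamma * B.T @ P @ B, B.T @ P @ T)
        return P, K

    K0 = np.zeros((2, 3))  # admissible initial policy here: sqrt(gamma)*T is Schur stable
    P_pi, K_pi = policy_iteration(T, B, Q, R, gamma, K0)

    # Cross-check: the discounted Riccati equation equals the undiscounted DARE
    # for the scaled pair (sqrt(gamma)*T, sqrt(gamma)*B).
    P_are = solve_discrete_are(np.sqrt(gamma) * T, np.sqrt(gamma) * B, Q, R)
    print("||P_pi - P_are|| =", np.linalg.norm(P_pi - P_are))

In this sketch, policy evaluation is a discrete Lyapunov equation (the discount absorbed by scaling the closed-loop matrix by sqrt(gamma)), and the fixed point is checked against SciPy's Riccati solver on the equivalently scaled undiscounted problem. The paper's online Q-learning algorithm would instead estimate the same quantities from measured state and input data, without using T and B.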