Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory
Cited by: 2
Author:
Zhao, Jin-Gang
Affiliation:
[1] Weifang Univ, Inst Intelligent Percept & Optimizat Control Compl, Sch Machinery & Automat, Weifang 261061, Shandong, Peoples R China
Keywords:
Discrete-time;
fully cooperative game (FCG);
multi-player systems;
Q-learning;
tracking control;
ZERO-SUM GAMES;
NONLINEAR-SYSTEMS;
DOI:
10.1007/s12555-022-1133-1
Chinese Library Classification (CLC) number:
TP [Automation technology, computer technology];
Subject classification code:
0812;
Abstract:
This paper studies the fully cooperative game tracking control problem (FCGTCP) for a class of discrete-time multi-player linear systems with unknown dynamics. The reference trajectory is generated by a command generator system. An augmented multi-player system, composed of the original multi-player system and the command generator system, is constructed, and an exponentially discounted cost function is introduced to derive an augmented fully cooperative game tracking algebraic Riccati equation (FCGTARE). When the system dynamics are known, a model-based policy iteration (PI) algorithm is proposed to solve the augmented FCGTARE. Furthermore, to remove the requirement of known system dynamics, an online reinforcement Q-learning algorithm is designed to obtain the solution to the augmented FCGTARE. The convergence of the designed online reinforcement Q-learning algorithm is proved. Finally, two simulation examples are given to verify the validity of the model-based PI algorithm and the online reinforcement Q-learning algorithm.
Pages: 1751-1759
Page count: 9
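As a rough illustration of the model-based policy iteration mentioned in the abstract, the sketch below runs a generic discounted linear-quadratic tracking iteration on an augmented state [x; r]. It is not the paper's algorithm. It assumes that, because the game is fully cooperative, all players share one cost, so their inputs can be stacked into a single input matrix. The matrices, the discount factor, and all function names are made up for illustration.

```python
import numpy as np


def discounted_lyapunov(Ac, Qk, gamma):
    """Solve P = Qk + gamma * Ac.T @ P @ Ac by vectorizing the equation."""
    n = Ac.shape[0]
    a = np.sqrt(gamma) * Ac.T
    # Row-major vec: vec(a P a.T) = kron(a, a) @ vec(P)
    vec_p = np.linalg.solve(np.eye(n * n) - np.kron(a, a), Qk.reshape(-1))
    P = vec_p.reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(T, B, Q1, R, gamma, K0, tol=1e-10, max_iter=100):
    """Alternate policy evaluation and policy improvement for u = -K X."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        # Policy evaluation: discounted cost matrix of the current gain K
        P = discounted_lyapunov(T - B @ K, Q1 + K.T @ R @ K, gamma)
        # Policy improvement: gain that minimizes the one-step lookahead cost
        K = gamma * np.linalg.solve(R + gamma * B.T @ P @ B, B.T @ P @ T)
        if P_prev is not None and np.max(np.abs(P - P_prev)) < tol:
            break
        P_prev = P
    return P, K


def are_residual(P, T, B, Q1, R, gamma):
    """Largest entry-wise violation of the discounted Riccati equation."""
    G = R + gamma * B.T @ P @ B
    rhs = (Q1 + gamma * T.T @ P @ T
           - gamma**2 * T.T @ P @ B @ np.linalg.solve(G, B.T @ P @ T))
    return np.max(np.abs(P - rhs))


# Hypothetical two-player plant x+ = A x + B1 u1 + B2 u2 with output y = C x
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B1 = np.array([[0.0], [1.0]])
B2 = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
F = np.array([[1.0]])          # command generator r+ = F r (constant reference)
Q = np.array([[10.0]])         # weight on the tracking error y - r
R = np.diag([1.0, 2.0])        # block-diagonal input weights of the two players
gamma = 0.8                    # discount factor (assumed value)

# Augmented system X = [x; r]
T = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), F]])
B = np.vstack([np.hstack([B1, B2]), np.zeros((1, 2))])
C1 = np.hstack([C, -np.eye(1)])  # tracking error e = C1 @ X
Q1 = C1.T @ Q @ C1

# K0 = 0 is admissible here because sqrt(gamma) * T is Schur stable
K0 = np.zeros((2, 3))
P, K = policy_iteration(T, B, Q1, R, gamma, K0)
residual = are_residual(P, T, B, Q1, R, gamma)
print("Riccati residual:", residual)
print("Player 1 gain:", K[0])
print("Player 2 gain:", K[1])
```

Each row of K is the feedback gain of one player on the augmented state. A model-free variant such as the Q-learning algorithm named in the abstract would have to estimate the same quantities from measured data instead of using T and B directly.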