Adaptive critic control with multi-step policy evaluation for nonlinear zero-sum games

Cited by: 2

Authors:

Li, Xin ^{[1,2,3,4]}

Wang, Ding ^{[1,2,3,4,5]}

Wang, Jiangyu ^{[1,2,3,4]}

Qiao, Junfei ^{[1,2,3,4]}

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China

[2] Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing, Peoples R China

[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing, Peoples R China

[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing, Peoples R China

[5] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Source:

INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL | 2024 / Vol. 34 / Issue 01

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China;

Keywords:

adaptive critic control; multi-step policy evaluation; nonlinear plants; optimal control; zero-sum games; STABILITY ANALYSIS; VALUE-ITERATION; SYSTEMS; ALGORITHM; DESIGNS;

DOI:

10.1002/rnc.6984

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

To attenuate the effect of disturbances on control performance, a multi-step adaptive critic control (MsACC) framework is developed to solve zero-sum games for discrete-time nonlinear systems. The MsACC algorithm uses multi-step policy evaluation to obtain the solution of the Hamilton-Jacobi-Isaacs equation, and it converges faster than an algorithm based on one-step policy evaluation. The convergence rate of the MsACC algorithm is adjustable by varying the step size of the policy evaluation. In addition, the stability and convergence of the MsACC algorithm are proved under certain conditions. To realize the MsACC algorithm, three neural networks are established to approximate the control input, the disturbance input, and the cost function, respectively. Finally, the effectiveness of the MsACC algorithm is verified by two simulation examples, including a linear system and a nonlinear plant.
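The idea of multi-step policy evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar linear plant x_{k+1} = a x_k + b u_k + d w_k with stage cost q x^2 + r u^2 - gamma^2 w^2 and a quadratic value function V(x) = p x^2, so no neural networks are needed. The function names `msacc_scalar` and `riccati_residual` and all numerical values are hypothetical choices made for illustration.

```python
def msacc_scalar(n_steps, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0,
                 tol=1e-10, max_iter=200):
    """Iterate on p in V(x) = p*x**2 using n_steps-step policy evaluation.

    Assumed (illustrative) scalar zero-sum game:
        x_next = a*x + b*u + d*w,  cost = q*x**2 + r*u**2 - gamma**2 * w**2
    Returns the final p and the number of iterations used.
    """
    p = 0.0  # initial value function V_0 = 0
    for it in range(1, max_iter + 1):
        # Policy improvement: saddle-point policies u = -k*x, w = l*x for the current p.
        delta = 1.0 + p * b * b / r - p * d * d / gamma**2
        k_gain = p * a * b / (r * delta)
        l_gain = p * a * d / (gamma**2 * delta)
        a_cl = a - b * k_gain + d * l_gain  # closed-loop dynamics under both policies
        stage = q + r * k_gain**2 - gamma**2 * l_gain**2  # stage-cost coefficient
        # Multi-step policy evaluation: accumulate n_steps stage costs along the
        # closed-loop trajectory, then bootstrap with the previous value function.
        p_new = stage * sum(a_cl ** (2 * j) for j in range(n_steps))
        p_new += a_cl ** (2 * n_steps) * p
        if abs(p_new - p) < tol:
            return p_new, it
        p = p_new
    return p, max_iter


def riccati_residual(p, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0):
    """Residual of the scalar game Riccati equation p = q + a^2 p / delta(p)."""
    delta = 1.0 + p * b * b / r - p * d * d / gamma**2
    return p - (q + a * a * p / delta)


if __name__ == "__main__":
    for n in (1, 3):
        p_star, iters = msacc_scalar(n)
        print(f"n_steps={n}: p={p_star:.6f}, iterations={iters}")
```

With `n_steps=1` the update reduces to ordinary value iteration. In this toy setting, larger `n_steps` values need fewer outer iterations to reach the same tolerance, which mirrors the adjustable convergence rate described in the abstract.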

Citation

Pages: 551-566

Page count: 16

43 references in total

