An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games

Cited by: 338

Authors:

Zhang, Huaguang [1]

Wei, Qinglai [2]

Liu, Derong [2]

Affiliations:

[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China

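Source:

AUTOMATICA | 2011 / Volume 47 / Issue 01

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);

Keywords: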

Adaptive critic designs; Adaptive dynamic programming; Approximate dynamic programming; Neural network; Zero-sum differential games; SYSTEMS; EQUATION;

DOI:

10.1016/j.automatica.2010.10.033

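Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract: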

This paper proposes a new iterative adaptive dynamic programming (ADP) method for solving a class of continuous-time nonlinear two-person zero-sum differential games. The idea is to use the ADP technique to iteratively obtain the optimal control pair that makes the performance index function reach the saddle point of the zero-sum differential game. If the saddle point does not exist, a mixed optimal control pair is obtained instead, which makes the performance index function reach the mixed optimum. A stability analysis of the nonlinear systems is presented, and the convergence of the performance index function is proved. Two simulation examples illustrate the performance of the proposed method. © 2010 Elsevier Ltd. All rights reserved.
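To make the idea of iterating toward a saddle point concrete, the following is a minimal sketch on a scalar linear-quadratic zero-sum game. It is not the algorithm from the paper, which targets nonlinear systems; it is standard policy iteration on a special case chosen because it has a closed-form answer to check against. The dynamics dx/dt = a*x + b*u + d*w, the cost weights q, r, gamma, the function name `solve_scalar_zero_sum_game`, and all numeric values are assumptions introduced only for illustration.

```python
def solve_scalar_zero_sum_game(a, b, d, q, r, gamma, p0, tol=1e-12, max_iter=100):
    """Policy iteration for the hypothetical scalar game
        dx/dt = a*x + b*u + d*w,
        cost  = integral of (q*x^2 + r*u^2 - gamma^2*w^2) dt,
    where u minimizes and w maximizes. Returns p such that V(x) = p*x^2.
    Assumes b^2/r > d^2/gamma^2 and that p0 gives a stabilizing policy pair."""
    p = p0
    for _ in range(max_iter):
        # Policy improvement: both players respond to the current value p*x^2.
        k_u = b * p / r            # control policy      u = -k_u * x
        k_w = d * p / gamma ** 2   # disturbance policy  w =  k_w * x
        a_cl = a - b * k_u + d * k_w  # closed-loop dynamics dx/dt = a_cl * x
        if a_cl >= 0:
            raise ValueError("policy pair is not stabilizing; choose a larger p0")
        # Policy evaluation: solve 2*a_cl*p + q + r*k_u^2 - gamma^2*k_w^2 = 0 for p.
        p_next = (q + r * k_u ** 2 - gamma ** 2 * k_w ** 2) / (-2.0 * a_cl)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


if __name__ == "__main__":
    # Illustrative values only; they do not come from the paper.
    p = solve_scalar_zero_sum_game(a=1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0, p0=5.0)
    print("value coefficient p:", p)
    print("saddle-point gains: u = -%.6f x, w = %.6f x" % (p, p / 4.0))
```

In this scalar case each iteration coincides with a Newton step on the game Riccati equation 2*a*p + q - (b^2/r - d^2/gamma^2)*p^2 = 0, so the result can be compared with the positive root of that quadratic. When b^2/r <= d^2/gamma^2 the sketch no longer applies, which loosely mirrors the situation in the abstract where a saddle point may fail to exist.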

Pages: 207-214

Page count: 8
