A Data-driven ADP with RBF Network and LSM Learning Algorithm

Cited by: 0

Authors:

Huang, Zhijian [1,2]

Li, Yudong [1]

Chen, Wentao [1]

Zhang, Qin [1]

Wu, Qili [1]

Tan, Qinmin [1]

Yang, Zhiyuan [1]

Affiliations:

[1] Shanghai Maritime Univ, Merchant Marine Coll, Shanghai 201306, Peoples R China

[2] Shanghai Jiao Tong Univ, Inst Power Plant & Automat, Shanghai 200030, Peoples R China

Source:

2017 6TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS (DDCLS) | 2017

Keywords:

Nonlinear; Approximate dynamic programming; RBF; LSM; Utility function; APPROXIMATE; SYSTEM;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Approximate dynamic programming (ADP) is an effective optimization method. However, the optimality it achieves depends on its network structure and training algorithm. After a detailed analysis of ADP, this paper adopts an RBF neural network to implement the critic and action networks. The LSM method is introduced as the training algorithm, and a novel basis function is defined, which achieves global optimization and online control. The validity of the approach is verified by finding the optimal point in the presence of local minima.
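The abstract's combination of an RBF network with a least-squares-type training step can be illustrated with a small sketch. This is not the paper's method: it assumes LSM refers to the least squares method, uses an ordinary Gaussian basis rather than the paper's novel basis function (which this record does not describe), and omits the ADP critic/action loop entirely. The test function `target`, the centers, the width, and the `ridge` term are all hypothetical choices made for illustration.

```python
import math

def gaussian_features(x, centers, width):
    """Gaussian RBF activations of a scalar input x (standard basis, assumed)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_rbf_least_squares(xs, ys, centers, width, ridge=1e-8):
    """Fit the output weights by solving the least-squares normal equations."""
    phi = [gaussian_features(x, centers, width) for x in xs]
    n = len(centers)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += ridge  # tiny regularizer, only for numerical stability
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(n)]
    return solve_linear(gram, rhs)

def predict(x, weights, centers, width):
    """Network output: weighted sum of the Gaussian activations."""
    feats = gaussian_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, feats))

def target(x):
    """Hypothetical cost-like function with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)

# Centers extend slightly beyond the sampled interval to cover its edges.
centers = [-5.0 + 0.5 * i for i in range(21)]
width = 0.5
xs = [-4.0 + 0.05 * i for i in range(161)]
ys = [target(x) for x in xs]

weights = fit_rbf_least_squares(xs, ys, centers, width)
max_error = max(abs(predict(x, weights, centers, width) - y)
                for x, y in zip(xs, ys))
# Grid search on the fitted network for the lowest predicted cost.
best_x = min(xs, key=lambda x: predict(x, weights, centers, width))
```

Because the output weights enter the network linearly, they can be obtained in a single linear solve rather than by iterative gradient descent, which is one common motivation for pairing RBF networks with least squares. Whether this matches the paper's own reasoning cannot be confirmed from this record.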

References

Pages: 346-349

Page count: 4

9 entries in total

[1] A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems [J]. Bhasin, S.; Kamalapurkar, R.; Johnson, M.; Vamvoudakis, K. G.; Lewis, F. L.; Dixon, W. E. AUTOMATICA, 2013, 49 (01): 82-92

[2] Govindhasamy J. J., 2005, IEE CONTROL ENG BOOK, V70, P293

[3] Approximate dynamic programming based supplementary reactive power control for DFIG wind farm to enhance power system stability [J]. Guo, Wentao; Liu, Feng; Si, Jennie; He, Dawei; Harley, Ronald; Mei, Shengwei. NEUROCOMPUTING, 2015, 170: 417-427

[4] Global optimality of approximate dynamic programming and its use in non-convex function minimization [J]. Heydari, Ali; Balakrishnan, S. N. APPLIED SOFT COMPUTING, 2014, 24: 291-303

[5] An approximate dynamic programming method for multi-input multi-output nonlinear system [J]. Huang, Zhijian; Ma, Jie; Huang, He. OPTIMAL CONTROL APPLICATIONS & METHODS, 2013, 34 (01): 80-95

[6] The derivation of iterative convergence calculation for a nonlinear MIMO approximate dynamic programming approach [J]. Huang, Zhijian; Ma, Jie; Huang, He. APPLIED MATHEMATICS AND COMPUTATION, 2013, 219 (09): 4495-4502

[7] Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes [J]. Lee, JM; Lee, JH. AUTOMATICA, 2005, 41 (07): 1281-1288

[8] Risk aversion and adaptive management: Insights from a multi-armed bandit model of invasive species risk [J]. Springborn, Michael R. JOURNAL OF ENVIRONMENTAL ECONOMICS AND MANAGEMENT, 2014, 68 (02): 226-242

[9] Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations [J]. Vamvoudakis, Kyriakos G.; Lewis, Frank L. AUTOMATICA, 2011, 47 (08): 1556-1569
