Neural-Network-Based Optimal Control for Discrete-Time Nonlinear Systems Using General Value Iteration

Cited by: 0
Authors
Li Hongliang [1]
Liu Derong [1]
Wang Ding [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Source
PROCEEDINGS OF THE 31ST CHINESE CONTROL CONFERENCE | 2012
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Adaptive dynamic programming; approximate dynamic programming; optimal control; value iteration; neural networks; reinforcement learning;
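DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract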
In this paper, we propose a novel adaptive dynamic programming (ADP) scheme based on general value iteration to obtain near-optimal control for discrete-time nonlinear systems with continuous state and control spaces. First, the initial value function is selected differently from that in traditional value iteration, and a new method is introduced to demonstrate the convergence property and convergence speed of the value function. Then, the control law obtained at each iteration can stabilize the system under some conditions. Finally, three neural networks trained with the Levenberg-Marquardt algorithm are used to approximate the unknown nonlinear system, the value function, and the optimal control law. One simulation example is presented to demonstrate the effectiveness of the proposed scheme.
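The idea of starting value iteration from a general initial value function can be illustrated on a much simpler problem than the one the paper treats. The following is a minimal sketch, not the authors' neural-network implementation: it assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with quadratic cost q*x^2 + r*u^2, so that each iterate has the form V_i(x) = p_i*x^2 and the value-iteration update reduces to a scalar Riccati recursion. The function names, the parameter values, and the choice of a zero start as the "traditional" baseline are assumptions made for illustration.

```python
def riccati_step(p, a, b, q, r):
    """One value-iteration update for V_i(x) = p * x**2.

    Minimizing q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u gives
    V_{i+1}(x) = p_next * x**2 with the p_next returned here.
    """
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)


def general_value_iteration(p0, a=0.9, b=1.0, q=1.0, r=1.0,
                            tol=1e-12, max_iter=1000):
    """Iterate from an arbitrary initial coefficient p0 >= 0.

    Returns the full list of iterates so convergence can be inspected.
    All default parameter values are illustrative assumptions.
    """
    p = p0
    history = [p]
    for _ in range(max_iter):
        p_next = riccati_step(p, a, b, q, r)
        history.append(p_next)
        if abs(p_next - p) < tol:
            break
        p = p_next
    return history


# Zero initial value function (assumed here as the usual baseline)
hist_zero = general_value_iteration(0.0)
# A nonzero positive initial value function, V_0(x) = 10 * x**2
hist_large = general_value_iteration(10.0)
```

In this toy setting both runs converge to the same coefficient: the zero start approaches it from below and the large start from above. This mirrors, in a linear-quadratic special case only, the kind of convergence behavior the abstract refers to.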
Pages: 2932-2937
Page count: 6