Data-Based Optimal Tracking Control of Nonaffine Nonlinear Discrete-Time Systems

Cited by: 0
Authors
Luo, Biao [1]
Liu, Derong [2]
Huang, Tingwen [3]
Li, Chao [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[3] Texas A&M Univ Qatar, POB 23874, Doha, Qatar
Source
NEURAL INFORMATION PROCESSING, ICONIP 2016, PT IV | 2016 / Vol. 9950
Keywords
Optimal tracking control; Data-based; Q-learning; Critic-only; H-INFINITY CONTROL; LINEAR-SYSTEMS; CONTROL SCHEME; APPROXIMATION; ITERATION;
DOI
10.1007/978-3-319-46681-1_68
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The optimal tracking control problem of nonaffine nonlinear discrete-time systems is considered in this paper. The problem relies on the solution of the so-called tracking Hamilton-Jacobi-Bellman equation, which is extremely difficult to solve even for simple systems. To overcome this difficulty, a data-based Q-learning algorithm is proposed that learns the optimal tracking control policy from data of the practical system. For implementation, a critic-only neural network structure is developed, in which only a critic neural network is required to estimate the Q-function, and a least-squares scheme is employed to update the neural network weights.
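The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical scalar nonaffine plant `x' = 0.5*x + tanh(u)`, a constant reference, a discounted quadratic tracking cost, and a critic that is linear in six quadratic features (standing in for the critic neural network). The plant is used only to generate data; the critic weights are then updated by least squares, and the minimization over the control is approximated on a grid. All names, constants, and the feature choice are assumptions made for illustration.

```python
import numpy as np

def plant(x, u):
    # Hypothetical nonaffine plant (assumption); used only to collect data.
    return 0.5 * x + np.tanh(u)

def features(x, r, u):
    # Quadratic basis in (state x, reference r, control u); shape (..., 6).
    x, r, u = np.broadcast_arrays(x, r, u)
    return np.stack([x * x, x * r, x * u, r * r, r * u, u * u], axis=-1)

def stage_cost(x, r, u, q=1.0, rho=0.1):
    # Tracking error penalty plus control penalty (weights are assumptions).
    return q * (x - r) ** 2 + rho * u ** 2

def greedy_q(w, x, r, u_grid):
    # Approximate min over u of Q(x, r, u) by searching a control grid.
    q = features(x[:, None], r[:, None], u_grid[None, :]) @ w
    idx = q.argmin(axis=1)
    return q[np.arange(len(x)), idx], u_grid[idx]

def fit_critic(n_samples=500, n_iters=30, gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    # Data set of (x, r, u, x_next, r_next) tuples sampled from the plant.
    x = rng.uniform(-1.0, 1.0, n_samples)
    r = rng.uniform(-1.0, 1.0, n_samples)
    u = rng.uniform(-1.0, 1.0, n_samples)
    x_next = plant(x, u)
    r_next = r  # constant reference trajectory (assumption)
    phi = features(x, r, u)
    cost = stage_cost(x, r, u)
    u_grid = np.linspace(-1.0, 1.0, 41)
    w = np.zeros(phi.shape[1])
    for _ in range(n_iters):
        # Q-learning target: stage cost plus discounted greedy Q at next step.
        q_next, _ = greedy_q(w, x_next, r_next, u_grid)
        target = cost + gamma * q_next
        # Least-squares update of the critic weights.
        w, *_ = np.linalg.lstsq(phi, target, rcond=None)
    return w, u_grid
```

After `w, u_grid = fit_critic()`, a control for a given state and reference can be read off with `greedy_q(w, np.array([x]), np.array([r]), u_grid)[1]`. No actor network appears anywhere: the policy is obtained directly from the critic, which is the sense of "critic-only" here. Convergence of this simplified fitted iteration is not guaranteed in general.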
Pages: 573-581
Page count: 9
Related papers
26 in total
[1]  
[Anonymous], IEEE T NEURAL NETW L
[2]  
[Anonymous], 2013, REINFORCEMENT LEARNI
[3]   Reinforcement learning in feedback control: Challenges and benchmarks from technical process control [J].
Hafner, Roland ;
Riedmiller, Martin .
MACHINE LEARNING, 2011, 84 (1-2) :137-169
[4]   UNIVERSAL APPROXIMATION OF AN UNKNOWN MAPPING AND ITS DERIVATIVES USING MULTILAYER FEEDFORWARD NETWORKS [J].
HORNIK, K ;
STINCHCOMBE, M ;
WHITE, H .
NEURAL NETWORKS, 1990, 3 (05) :551-560
[5]   Approximate optimal trajectory tracking for continuous-time nonlinear systems [J].
Kamalapurkar, Rushikesh ;
Dinh, Huyen ;
Bhasin, Shubhendu ;
Dixon, Warren E. .
AUTOMATICA, 2015, 51 :40-48
[6]   Optimal Tracking Control of Unknown Discrete-Time Linear Systems Using Input-Output Measured Data [J].
Kiumarsi, Bahare ;
Lewis, Frank L. ;
Naghibi-Sistani, Mohammad-Bagher ;
Karimpour, Ali .
IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (12) :2770-2779
[7]   Actor-Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems [J].
Kiumarsi, Bahare ;
Lewis, Frank L. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (01) :140-151
[8]   Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics [J].
Kiumarsi, Bahare ;
Lewis, Frank L. ;
Modares, Hamidreza ;
Karimpour, Ali ;
Naghibi-Sistani, Mohammad-Bagher .
AUTOMATICA, 2014, 50 (04) :1167-1175
[9]   Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics [J].
Liu, Derong ;
Yang, Xiong ;
Li, Hongliang .
NEURAL COMPUTING & APPLICATIONS, 2013, 23 (7-8) :1843-1850
[10]   Reinforcement Learning Design-Based Adaptive Tracking Control With Less Learning Parameters for Nonlinear Discrete-Time MIMO Systems [J].
Liu, Yan-Jun ;
Tang, Li ;
Tong, Shaocheng ;
Chen, C. L. Philip ;
Li, Dong-Juan .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (01) :165-176