Integrated adaptive dynamic programming for data-driven optimal controller design

Cited by: 10
Authors
Li, Guoqiang [1 ]
Goerges, Daniel [1 ]
Mu, Chaoxu [2 ]
Affiliations
[1] Univ Kaiserslautern, Dept Elect & Comp Engn, Erwin Schrodinger Str 12, D-67663 Kaiserslautern, Germany
[2] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
Keywords
TIME NONLINEAR-SYSTEMS; TRACKING CONTROL; SCHEME;
DOI
10.1016/j.neucom.2020.04.095
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel integrated adaptive dynamic programming method with an advantage function is developed to solve model-free optimal control problems and to improve control performance. The advantage function evaluates the cost incurred when the action (the control variables) deviates from the optimal control policy. The Q function in Q-learning can thus be built from a value function and the advantage function, and the control policy is then improved by minimizing the Q function. To implement the proposed algorithm, an integrated multi-layer neural network (INN) is designed for both the value function and the control variables, so only a single neural network requires adaptation. This avoids the iterative learning of two separate networks used in heuristic dynamic programming-based methods. Simulations of linear and nonlinear optimal control problems are studied. Compared with the optimal solutions obtained from the linear quadratic regulator and dynamic programming (DP), the proposed INN design achieves control performance closer to the optimum than action-dependent heuristic dynamic programming (ADHDP). Furthermore, the INN is applied to optimize the energy management strategy of hybrid electric vehicles for fuel economy. The fuel consumption with the INN is lower than with ADHDP and much closer to the optimal results of DP, indicating near fuel-optimality and effective practical applicability. © 2020 Elsevier B.V.
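The abstract's core idea — building the Q function from a value function plus an advantage function, with the advantage measuring the extra cost of deviating from the optimal policy — can be illustrated on a toy linear-quadratic problem. This is a minimal sketch with made-up system matrices computed from a known model; the paper's INN instead learns these quantities from data without a model:

```python
import numpy as np

# Toy discrete-time linear system x_{k+1} = A x_k + B u_k with quadratic cost
# (hypothetical matrices chosen only for illustration).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)            # state cost weight
Rc = np.array([[1.0]])    # control cost weight

# Solve the discrete-time Riccati equation by fixed-point iteration.
P = np.eye(2)
for _ in range(500):
    K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
    P = Qc + A.T @ P @ (A - B @ K)

def value(x):
    # Value function V(x) = x' P x (cost-to-go under the optimal policy).
    return float(x.T @ P @ x)

def q_fun(x, u):
    # Q(x, u) = stage cost + value of the successor state.
    x_next = A @ x + B @ u
    return float(x.T @ Qc @ x + u.T @ Rc @ u) + value(x_next)

def advantage(x, u):
    # Advantage = extra cost of applying u instead of the optimal action,
    # so Q(x, u) = V(x) + advantage(x, u).
    return q_fun(x, u) - value(x)

x = np.array([[1.0], [-0.5]])
u_star = -K @ x                       # minimizing Q recovers the optimal policy
print(advantage(x, u_star))           # ~0 at the optimal action
print(advantage(x, u_star + 0.3))     # positive for a suboptimal action
```

The advantage is zero at the optimal action and strictly positive elsewhere, which is what makes "improve the policy by minimizing Q" equivalent to driving the advantage to zero.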
Pages: 143-152
Page count: 10