Integrated adaptive dynamic programming for data-driven optimal controller design

Cited by: 10

Authors:

Li, Guoqiang [1]

Goerges, Daniel [1]

Mu, Chaoxu [2]

Affiliations:

[1] Univ Kaiserslautern, Dept Elect & Comp Engn, Erwin Schrodinger Str 12, D-67663 Kaiserslautern, Germany

[2] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

Source:

NEUROCOMPUTING | 2020 / Vol. 403

Keywords:

TIME NONLINEAR-SYSTEMS; TRACKING CONTROL; SCHEME;

DOI:

10.1016/j.neucom.2020.04.095

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a novel integrated adaptive dynamic programming method with an advantage function is developed to solve model-free optimal control problems and improve control performance. The advantage function is used to evaluate the cost resulting from an action (the control variables) that does not follow the optimal control policy. The Q function in Q-learning can thus be built from a value function and the advantage function, and the control policy is then improved by minimizing the Q function. To employ the proposed algorithm, an integrated multi-layer neural network (INN) is designed for the value function and the control variables. Only a single neural network requires adaptation, which avoids the iterative learning of two separate networks in heuristic dynamic programming-based methods. Simulations of linear and nonlinear optimal control problems are studied. Compared with action-dependent heuristic dynamic programming (ADHDP), the proposed INN design achieves control performance closer to the optimal solutions obtained from the linear quadratic regulator and dynamic programming (DP). Furthermore, INN is applied to optimize the energy management strategy of hybrid electric vehicles for fuel economy. The fuel consumption with INN is lower than that with ADHDP and much closer to the optimal result from DP. The results indicate near fuel-optimality and effective practical applicability. © 2020 Elsevier B.V.
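The decomposition described in the abstract, a Q function built from a value function plus an advantage function that penalizes deviation from the policy, can be illustrated with a minimal sketch. The abstract does not give the functional forms, so everything below is an assumption made for illustration: the scalar state and control, the quadratic `value`, the linear `policy`, the quadratic `advantage` with a fixed `curvature`, and all function names are hypothetical and are not the paper's INN parameterization.

```python
# Illustrative sketch only: the quadratic advantage form and every name here
# are assumptions, not the parameterization used in the paper.

def value(x):
    """Hypothetical value function V(x): a fixed quadratic cost-to-go."""
    return 2.0 * x * x

def policy(x):
    """Hypothetical control policy mu(x): a linear state feedback."""
    return -0.5 * x

def advantage(x, u, curvature=1.5):
    """Cost A(x, u) >= 0 of deviating from policy(x); zero on the policy."""
    deviation = u - policy(x)
    return curvature * deviation * deviation

def q_function(x, u):
    """Q(x, u) = V(x) + A(x, u)."""
    return value(x) + advantage(x, u)

def greedy_action(x, candidates):
    """Pick the candidate control that minimizes Q(x, u)."""
    return min(candidates, key=lambda u: q_function(x, u))

x = 2.0
candidates = [-2.0, -1.5, -1.0, -0.5, 0.0]
best_u = greedy_action(x, candidates)
```

Under these assumptions the advantage is non-negative and vanishes on the policy, so minimizing Q over the control recovers the policy output and Q reduces to V there. This is the reason a single structure carrying both V and the control can stand in for a separate critic and actor in this kind of design.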


Pages: 143-152

Page count: 10

