The intelligent critic framework for advanced optimal control

Cited by: 136
Authors
Wang, Ding [1 ,2 ,3 ,4 ]
Ha, Mingming [5 ]
Zhao, Mingming [1 ,2 ,3 ,4 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Advanced optimal control; Dynamic systems; Intelligent critic; HORIZON OPTIMAL-CONTROL; TIME NONLINEAR-SYSTEMS; OPTIMAL TRACKING CONTROL; VALUE-ITERATION; FEEDBACK-CONTROL; ROBUST-CONTROL; ALGORITHMS; ADP; MODELS; GAME;
DOI
10.1007/s10462-021-10118-9
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Optimization is a foundation of many disciplines and is therefore useful across a wide range of research fields, particularly artificial-intelligence-based advanced control design. Because optimal control problems for general nonlinear systems are difficult to solve, novel learning strategies with intelligent components are needed. Moreover, the rapid development of computer and networked techniques has promoted research on optimal control in the discrete-time domain. This paper presents the foundations, the derivation, and recent progress of critic intelligence for discrete-time advanced optimal control design, with an emphasis on the iterative framework. In particular, the critic intelligence methodology, which integrates learning approximators with the reinforcement formulation, is highlighted.
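To make the iterative framework concrete, the following is a minimal sketch (not taken from the paper) of discrete-time value iteration, V_{i+1}(x) = min_u [ U(x,u) + V_i(F(x,u)) ], on a gridded scalar system; the dynamics F, utility U, grids, and iteration limits are illustrative assumptions, and the tabular/interpolated value function stands in for the learning approximators the survey discusses.

```python
# Minimal sketch (illustrative assumptions, not the paper's algorithm):
# discrete-time value iteration V_{i+1}(x) = min_u [ U(x,u) + V_i(F(x,u)) ]
# on a gridded scalar system, with interpolation acting as a simple critic.
import numpy as np

xs = np.linspace(-2.0, 2.0, 201)          # state grid (assumed)
us = np.linspace(-1.0, 1.0, 101)          # control grid (assumed)

def F(x, u):                              # example nonlinear dynamics (assumed)
    return 0.8 * np.sin(x) + u

def U(x, u):                              # quadratic utility (stage cost)
    return x**2 + u**2

V = np.zeros_like(xs)                     # V_0 = 0, a common initialization
for i in range(200):                      # iterative critic updates
    Xn = F(xs[:, None], us[None, :])      # next states for every (x, u) pair
    Vn = np.interp(np.clip(Xn, xs[0], xs[-1]), xs, V)
    Q = U(xs[:, None], us[None, :]) + Vn  # right-hand side of the recursion
    V_new = Q.min(axis=1)                 # greedy minimization over controls
    if np.max(np.abs(V_new - V)) < 1e-6:  # stop once the value function settles
        V = V_new
        break
    V = V_new

policy = us[Q.argmin(axis=1)]             # greedy (near-optimal) control law
```

In the critic intelligence methods surveyed in the paper, the grid and interpolation above are replaced by trained approximators (e.g., neural critics), and the dynamics may be learned from data rather than assumed known.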
Pages: 1-22
Number of pages: 22