The intelligent critic framework for advanced optimal control

Cited by: 136
Authors
Wang, Ding [1 ,2 ,3 ,4 ]
Ha, Mingming [5 ]
Zhao, Mingming [1 ,2 ,3 ,4 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Advanced optimal control; Dynamic systems; Intelligent critic; HORIZON OPTIMAL-CONTROL; TIME NONLINEAR-SYSTEMS; OPTIMAL TRACKING CONTROL; VALUE-ITERATION; FEEDBACK-CONTROL; ROBUST-CONTROL; ALGORITHMS; ADP; MODELS; GAME;
DOI
10.1007/s10462-021-10118-9
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Optimization is a foundation of many disciplines and is therefore useful across a wide range of research fields, particularly artificial-intelligence-based advanced control design. Because optimal control problems for general nonlinear systems are difficult to solve, novel learning strategies with intelligent components are needed. Moreover, the rapid development of computer and networked techniques has promoted research on optimal control in the discrete-time domain. This paper presents the foundations, the derivation, and recent progress of critic intelligence for discrete-time advanced optimal control design, with an emphasis on the iterative framework. In particular, the critic intelligence methodology, which integrates learning approximators with the reinforcement formulation, is highlighted.
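To make the iterative framework concrete, the following is a minimal sketch (not taken from the paper) of discrete-time value iteration, V_{i+1}(x) = min_u [ U(x,u) + V_i(F(x,u)) ], on a gridded scalar system; the dynamics F, utility U, grids, and iteration limits are illustrative assumptions, and the tabular/interpolated value function stands in for the learning approximators the survey discusses.

```python
# Minimal sketch (illustrative assumptions, not the paper's algorithm):
# discrete-time value iteration V_{i+1}(x) = min_u [ U(x,u) + V_i(F(x,u)) ]
# on a gridded scalar system, with interpolation acting as a simple critic.
import numpy as np

xs = np.linspace(-2.0, 2.0, 201)          # state grid (assumed)
us = np.linspace(-1.0, 1.0, 101)          # control grid (assumed)

def F(x, u):                              # example nonlinear dynamics (assumed)
    return 0.8 * np.sin(x) + u

def U(x, u):                              # quadratic utility (stage cost)
    return x**2 + u**2

V = np.zeros_like(xs)                     # V_0 = 0, a common initialization
for i in range(200):                      # iterative critic updates
    Xn = F(xs[:, None], us[None, :])      # next states for every (x, u) pair
    Vn = np.interp(np.clip(Xn, xs[0], xs[-1]), xs, V)
    Q = U(xs[:, None], us[None, :]) + Vn  # right-hand side of the recursion
    V_new = Q.min(axis=1)                 # greedy minimization over controls
    if np.max(np.abs(V_new - V)) < 1e-6:  # stop once the value function settles
        V = V_new
        break
    V = V_new

policy = us[Q.argmin(axis=1)]             # greedy (near-optimal) control law
```

In the critic intelligence methods surveyed in the paper, the grid and interpolation above are replaced by trained approximators (e.g., neural critics), and the dynamics may be learned from data rather than assumed known.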
Pages: 1-22
Number of pages: 22