Optimal Control Using IsoCost-Based Dynamic Programming

Cited by: 1
Authors
Alvankarian, Fatemeh [1]
Kalhor, Ahmad [1]
Masouleh, Mehdi Tale [1]
Affiliation
[1] Univ Tehran, Sch Elect & Comp Engn, Human & Robot Interact Lab, Tehran, Iran
Keywords
dynamic programming; learning systems; optimal control; HJB equations; viscosity solutions; state constraints; approximation
DOI
10.1049/cth2.70014
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper introduces a novel data-driven optimal control method based on reinforcement learning concepts. The proposed algorithm serves as an alternative to solving the Hamilton-Jacobi-Bellman equation directly. Its central concept is the IsoCost hypersurface (ICHS): a hypersurface in the state space of the system formed by the points from which the control strategy spends a specific amount of cost to asymptotically stabilize the system. The name IsoCost reflects the fact that the control strategy must spend an equal cost to stabilize every point on a given ICHS. Additional assumptions and definitions are stated before the theory of ICHS optimality is presented. This theory proves, by contradiction, that the ICHS corresponding to the optimal control policy surrounds the ICHSs corresponding to non-optimal control solutions, which paves the way to finding the optimal control solution using dynamic programming. The proposed method is implemented on linear, fixed-base inverted pendulum, cart-pole, and torsional pendulum bar system models, and the results are compared with those in the literature. The performance of the method in terms of cost, settling time, and computation time is demonstrated through numerical and illustrative comparisons.
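The surrounding property described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm; it assumes a scalar linear system dx/dt = a*x + b*u with the quadratic cost integral of (q*x^2 + r*u^2) and linear feedback u = -k*x, all of which are choices made here for illustration. Under these assumptions the cost-to-go from x0 is p(k)*x0^2, so the IsoCost set for a cost level c reduces to the two points x = +/- sqrt(c/p(k)). The function names and parameter values are hypothetical.

```python
import math


def cost_coefficient(a, b, q, r, k):
    """Return p such that the cost-to-go from x0 under u = -k*x is p * x0**2.

    Assumed model: dx/dt = a*x + b*u, cost = integral of (q*x**2 + r*u**2) dt.
    """
    decay = b * k - a  # closed-loop decay rate; must be positive for stability
    if decay <= 0:
        raise ValueError("gain k does not stabilize the system")
    return (q + r * k * k) / (2.0 * decay)


def optimal_gain(a, b, q, r):
    """Optimal feedback gain from the scalar algebraic Riccati equation."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def isocost_radius(p, c):
    """Distance from the origin of the points whose cost-to-go equals c."""
    return math.sqrt(c / p)


# Illustrative (assumed) parameters for an unstable scalar plant.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
cost_level = 1.0

k_star = optimal_gain(a, b, q, r)
r_star = isocost_radius(cost_coefficient(a, b, q, r, k_star), cost_level)
print(f"optimal gain k* = {k_star:.4f}, IsoCost radius = {r_star:.4f}")

# Any other stabilizing gain yields an IsoCost set closer to the origin,
# i.e. enclosed by the optimal one.
for k in (1.5, 2.0, 4.0, 10.0):
    radius = isocost_radius(cost_coefficient(a, b, q, r, k), cost_level)
    print(f"gain k = {k:5.2f}, IsoCost radius = {radius:.4f}")
```

For the same cost level, the optimal gain reaches states farther from the origin than any other stabilizing gain in this toy setting, which is the one-dimensional analogue of the optimal ICHS surrounding the non-optimal ones.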
Pages: 13