Online Policy Learning-Based Output-Feedback Optimal Control of Continuous-Time Systems

Cited by: 36
Authors
Zhao, Jun [1]
Lv, Yongfeng [2]
Zeng, Qingliang [1, 3]
Wan, Lirong [1]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Mech & Elect Engn, Qingdao 266590, Peoples R China
[2] Taiyuan Univ Technol, Coll Elect & Power Engn, Taiyuan 030024, Peoples R China
[3] Shandong Normal Univ, Dept Informat Sci & Engn, Jinan 250358, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Optimal control; Mathematical models; Heuristic algorithms; Atmospheric modeling; Riccati equations; Observers; Convergence; Output-feedback control; Policy learning; Continuous-time systems; Linear systems
DOI
10.1109/TCSII.2022.3211832
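Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809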
Abstract
Although state-feedback optimal control of continuous-time (CT) systems has been extensively studied, solving the optimal control problem online via output feedback remains challenging, especially when only input-output information is available. In this brief, we develop an innovative technique for designing the output-feedback optimal control (OFOC) of CT systems online. First, to synthesize the OFOC, an output-feedback algebraic Riccati equation (OARE) is constructed, which can be solved using input-output information. Then, an online policy learning (PL) algorithm is developed to compute the solution of the OARE; it requires only input-output information and avoids the conventional offline learning procedure. Simulations based on an aircraft model are provided to test the developed control method and online learning algorithm.
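As background only: this record does not give the OARE or the PL update rule, so the sketch below is not the authors' method. It shows the classical model-based policy iteration (Kleinman's algorithm) for the standard state-feedback algebraic Riccati equation, which alternates policy evaluation (a Lyapunov equation) with policy improvement (a gain update). Unlike the paper's setting, it assumes full state access and known matrices. The double-integrator system, the weights `Q` and `R`, the initial gain `K0`, and all function names are illustrative assumptions.

```python
import numpy as np


def solve_lyapunov(Ak, M):
    """Solve Ak.T @ P + P @ Ak + M = 0 for P as a linear system (Kronecker form)."""
    n = Ak.shape[0]
    I = np.eye(n)
    # Row-major vectorization: vec(Ak.T P) = kron(Ak.T, I) vec(P),
    # vec(P Ak) = kron(I, Ak.T) vec(P).
    L = np.kron(Ak.T, I) + np.kron(I, Ak.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # remove round-off asymmetry


def kleinman_policy_iteration(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration; K0 must make A - B @ K0 stable."""
    K = K0
    for _ in range(iters):
        Ak = A - B @ K                           # closed loop under current policy
        P = solve_lyapunov(Ak, Q + K.T @ R @ K)  # policy evaluation
        K = np.linalg.solve(R, B.T @ P)          # policy improvement
    return P, K


# Hypothetical example system (double integrator), not the paper's aircraft model.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # assumed stabilizing initial gain

P, K = kleinman_policy_iteration(A, B, Q, R, K0)
# Residual of the algebraic Riccati equation; it should be close to zero.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("P =\n", P)
print("K =", K)
print("max |ARE residual| =", np.abs(residual).max())
```

For this example the Riccati equation has the closed-form solution P = [[√3, 1], [1, √3]] with gain K = [1, √3], which gives a simple check on the iteration. An output-feedback, data-driven scheme such as the one the abstract describes would have to replace both the known matrices and the measured state with quantities built from input-output data.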
Pages: 652-656
Page count: 5