Composite Observer-Based Optimal Attitude-Tracking Control With Reinforcement Learning for Hypersonic Vehicles

Cited by: 30
Authors
Zhao, Shangwei [1, 2]
Wang, Jingcheng [1, 2]
Xu, Haotian [3]
Wang, Bohui [4, 5]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Minist Educ China, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Syst Control & Informat Proc, Minist Educ China, Shanghai 200240, Peoples R China
[3] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
[4] Xi An Jiao Tong Univ, Sch Cyber Sci & Engn, Xian 710049, Peoples R China
[5] Xidian Univ, Sch Aerosp Sci & Technol, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hypersonic vehicles; Nonlinear dynamical systems; Optimal control; Observers; Attitude control; Aerodynamics; Vehicle dynamics; Attitude-tracking control; near-optimal control; observer design; reinforcement learning (RL); ROBUST OPTIMAL-CONTROL; NONLINEAR-SYSTEMS; EXPERIENCE REPLAY; NEURAL-NETWORK; DESIGN;
DOI
10.1109/TCYB.2022.3192871
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article proposes an observer-based reinforcement learning (RL) control approach to address the optimal attitude-tracking problem and its application to hypersonic vehicles in the reentry phase. Because parameter perturbations and external disturbances introduce unknown uncertainty and nonlinearity, accurate model information of hypersonic vehicles in the reentry phase is generally unavailable. For this reason, a novel synchronous estimation scheme is proposed to construct a composite observer for hypersonic vehicles, which consists of a neural-network (NN)-based Luenberger-type observer and a synchronous disturbance observer. This solves the identification problem of nonlinear dynamics in the reference control and realizes estimation of the system state when unknown nonlinear dynamics and unknown disturbances exist at the same time. By synthesizing the information from the composite observer, an RL tracking controller is developed to solve the optimal attitude-tracking control problem. To improve the convergence of the critic network weights, concurrent learning is employed, replacing the traditional persistent excitation condition with replay of historical experience. In addition, this article proves that the weight estimation error is bounded when the learning rate satisfies a given sufficient condition. Finally, numerical simulation demonstrates the effectiveness and superiority of the proposed approach for attitude-tracking control of hypersonic vehicles.
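The concurrent-learning idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the article's critic update law; it is a generic discrete-time, linear-in-parameters regression, and every name in it (`concurrent_learning_step`, `run_demo`, the feature vectors, the learning rate) is a hypothetical choice made for illustration. It assumes a model y = w·phi and compares a gradient update that uses only the current sample with one that also replays a small stack of stored samples whose features span the parameter space.

```python
import numpy as np

def concurrent_learning_step(w, phi_now, y_now, history, lr):
    """One gradient step on the current sample plus every stored sample."""
    # Error-weighted feature from the current measurement.
    grad = phi_now * (y_now - w @ phi_now)
    # Replay recorded (feature, target) pairs from the history stack.
    for phi_j, y_j in history:
        grad += phi_j * (y_j - w @ phi_j)
    return w + lr * grad

def run_demo(steps=2000, lr=0.05):
    # Hypothetical "true" weights that the estimator should recover.
    w_true = np.array([1.0, -2.0, 0.5])
    # History stack: three linearly independent features (rank condition).
    history = [(phi, w_true @ phi) for phi in np.eye(3)]
    # Constant current feature, so the live signal is NOT persistently exciting.
    phi_now = np.array([1.0, 1.0, 0.0])
    y_now = w_true @ phi_now
    w_cl = np.zeros(3)      # estimate with history replay
    w_plain = np.zeros(3)   # estimate from the current sample only
    for _ in range(steps):
        w_cl = concurrent_learning_step(w_cl, phi_now, y_now, history, lr)
        w_plain = concurrent_learning_step(w_plain, phi_now, y_now, [], lr)
    return w_true, w_cl, w_plain

if __name__ == "__main__":
    w_true, w_cl, w_plain = run_demo()
    print("true weights:       ", w_true)
    print("with history replay:", w_cl)
    print("current sample only:", w_plain)
```

In this toy setting the replayed estimate approaches the true weights because the stored features span all three directions, whereas the estimate driven by the single constant feature only corrects the error along that one direction. The article's setting (continuous-time critic networks with a bounded-error guarantee under a learning-rate condition) is more involved than this sketch.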
Pages: 913-926
Page count: 14