Aero-engine acceleration control using deep reinforcement learning with phase-based reward function

Times cited: 12
Authors
Hu, Qian-Kun [1]
Zhao, Yong-Ping [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Energy & Power Engn, 29 Yudao St, Nanjing 210016, Peoples R China
Keywords
Aero-engine acceleration control; deep reinforcement learning; feedback control system; reward function design
DOI
10.1177/09544100211046225
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
In this paper, the conventional aero-engine acceleration control task is formulated as a Markov Decision Process (MDP). A novel phase-based reward function is then proposed to enhance the performance of deep reinforcement learning (DRL) in feedback control tasks. Using this reward function, an aero-engine controller based on Trust Region Policy Optimization (TRPO) is developed to improve aero-engine acceleration performance. Four comparison simulations were conducted to verify the effectiveness of the proposed methods. The simulation results show that the phase-based reward function helps to eliminate the oscillation of the aero-engine control system that the traditional goal-based reward function causes when DRL is applied to aero-engine control. The TRPO controller also outperforms the deep Q-learning (DQN) and proportional-integral-derivative (PID) controllers in the aero-engine acceleration control task. Compared with the DQN and PID controllers, it reduces the aero-engine acceleration time by 0.6 s and 2.58 s, and improves acceleration performance by 16.8% and 46.4%, respectively.
Pages: 1878-1894
Number of pages: 17