Aero-engine acceleration control using deep reinforcement learning with phase-based reward function

Cited by: 12

Authors:

Hu, Qian-Kun [1]

Zhao, Yong-Ping [1]

Affiliation:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Energy & Power Engn, 29 Yudao St, Nanjing 210016, Peoples R China

Source:

PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART G-JOURNAL OF AEROSPACE ENGINEERING | 2022 / Vol. 236 / Issue 09

Keywords:

Aero-engine acceleration control; deep reinforcement learning; feedback control system; reward function design;

DOI:

10.1177/09544100211046225

Chinese Library Classification (CLC):

V [Aviation, Aerospace];

Discipline classification code:

08 ; 0825 ;

Abstract:

In this paper, the conventional aero-engine acceleration control task is formulated as a Markov Decision Process (MDP). A novel phase-based reward function is then proposed to improve the performance of deep reinforcement learning (DRL) on feedback control tasks. With this reward function, an aero-engine controller based on Trust Region Policy Optimization (TRPO) is developed to improve aero-engine acceleration performance. Four comparison simulations were conducted to verify the effectiveness of the proposed methods. The results show that the phase-based reward function helps eliminate the oscillation in the aero-engine control system that the traditional goal-based reward function causes when DRL is applied to aero-engine control. The TRPO controller also outperforms the deep Q-network (DQN) and proportional-integral-derivative (PID) controllers in the acceleration control task: compared with DQN and PID, the acceleration time is reduced by 0.6 s and 2.58 s, and the acceleration performance is improved by 16.8% and 46.4%, respectively.
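The abstract gives both absolute and relative improvements, so the two pairs of figures can be checked against each other. The short Python sketch below does this under one assumption that the abstract does not state: that the percentage improvement is the relative reduction in acceleration time with respect to each baseline controller. The function name `implied_times` is hypothetical, and the derived times are back-of-the-envelope values, not results reported in the paper.

```python
def implied_times(delta_s, rel_improvement):
    """Return (baseline_time, improved_time) in seconds.

    Assumes rel_improvement = delta_s / baseline_time, i.e. the percentage
    is a relative reduction in acceleration time (an assumption, not
    something the abstract defines).
    """
    baseline = delta_s / rel_improvement
    return baseline, baseline - delta_s


# Figures quoted in the abstract: 0.6 s and 16.8% vs DQN, 2.58 s and 46.4% vs PID.
dqn_baseline, trpo_from_dqn = implied_times(0.6, 0.168)
pid_baseline, trpo_from_pid = implied_times(2.58, 0.464)

print(f"implied DQN time:  {dqn_baseline:.2f} s, implied TRPO time: {trpo_from_dqn:.2f} s")
print(f"implied PID time:  {pid_baseline:.2f} s, implied TRPO time: {trpo_from_pid:.2f} s")
```

Under that assumption, both comparisons imply nearly the same TRPO acceleration time (about 3 s), which suggests the quoted seconds and percentages are mutually consistent.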


Pages: 1878-1894

Page count: 17

