Reinforcement Learning for UAV Attitude Control

Cited by: 293

Authors:

Koch, William [1]

Mancuso, Renato [1]

West, Richard [1]

Bestavros, Azer [1]

Affiliation:

[1] Boston Univ, Dept Comp Sci, 111 Cummington Mall, Boston, MA 02215 USA

Source:

ACM Transactions on Cyber-Physical Systems | 2019 / Vol. 3 / No. 2

Funding:

National Science Foundation (USA)

Keywords:

Attitude control; UAV; reinforcement learning; quadcopter; autopilot; machine learning; PID; intelligent control; adaptive control; QUADROTOR;

DOI:

10.1145/3301273

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Autopilot systems are typically composed of an "inner loop" providing stability and control, whereas an "outer loop" is responsible for mission-level objectives, such as way-point navigation. Autopilot systems for unmanned aerial vehicles are predominantly implemented using Proportional-Integral-Derivative (PID) control systems, which have demonstrated exceptional performance in stable environments. However, more sophisticated control is required to operate in unpredictable and harsh environments. Intelligent flight control systems are an active area of research addressing the limitations of PID control, most recently through the use of reinforcement learning (RL), which has had success in other applications, such as robotics. Yet previous work has focused primarily on using RL at the mission-level controller. In this work, we investigate the performance and accuracy of the inner control loop providing attitude control when using intelligent flight control systems trained with state-of-the-art RL algorithms: Deep Deterministic Policy Gradient, Trust Region Policy Optimization, and Proximal Policy Optimization. To investigate these unknowns, we first developed an open source high-fidelity simulation environment to train a flight controller for attitude control of a quadrotor through RL. We then used our environment to compare the performance of the trained controllers to that of a PID controller to identify whether using RL is appropriate in high-precision, time-critical flight control.

Pages: 21