Robust Control Strategy for Quadrotor Drone Using Reference Model-Based Deep Deterministic Policy Gradient

Cited by: 4
Authors
Liu, Hongxun [1 ]
Suzuki, Satoshi [1 ]
Wang, Wei [2 ]
Liu, Hao [1 ]
Wang, Qi [1 ]
Affiliations
[1] Chiba Univ, Grad Sch Sci & Engn, Chiba 2638522, Japan
[2] Nanjing Univ Informat Sci & Technol, Jiangsu Collaborat Innovat Ctr Atmospher Environm, Nanjing 210044, Peoples R China
Keywords
reinforcement learning; quadrotor drone; deterministic policy; neural network
DOI
10.3390/drones6090251
CLC Classification
TP7 [Remote Sensing Technology]
Discipline Classification Codes
081102; 0816; 081602; 083002; 1404
Abstract
Because of the differences between simulation and the real world, applying reinforcement learning (RL) to drone control often leads to problems such as oscillation and instability. This study proposes a control strategy for quadrotor drones that combines a reference model (RM) with deep RL. Unlike conventional optimal- and adaptive-control approaches, the method uses a deep neural network to design the flight controller, mapping the drone's states and target values directly to control commands. The controller is built on the deep deterministic policy gradient (DDPG) algorithm, and the RM is incorporated into the actor-critic structure to enhance robustness and dynamic stability. The practicability of the RM-DDPG flight-control strategy was confirmed through a two-part experiment. First, a quadrotor model was constructed from an actual drone, and the policy was trained offline on this model. Second, the performance of the policy was evaluated in simulations by examining the transition of the system states and the controller output. The proposed strategy eliminates oscillation and steady-state error and remains robust to changes in the target value and to external disturbances.
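The abstract describes a DDPG actor-critic in which a deep network maps the drone's states and a reference-model-shaped target directly to control commands. The sketch below is a minimal, illustrative rendition of that structure in Python/PyTorch; the first-order reference model, the network sizes, and the way the reference signal is concatenated with the state are assumptions made for illustration, not the paper's actual design.

import torch
import torch.nn as nn

class ReferenceModel:
    """Illustrative first-order reference model: smooths the raw target
    before it reaches the policy (assumed form, not the paper's RM)."""
    def __init__(self, tau=0.5, dt=0.01):
        self.alpha = dt / (tau + dt)  # discrete-time smoothing factor
        self.ref = None

    def step(self, target):
        if self.ref is None:
            self.ref = target.clone()
        self.ref = self.ref + self.alpha * (target - self.ref)
        return self.ref

class Actor(nn.Module):
    """Maps (state, reference) directly to a bounded control command."""
    def __init__(self, state_dim, ref_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + ref_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, action_dim), nn.Tanh(),
        )

    def forward(self, state, ref):
        return self.net(torch.cat([state, ref], dim=-1))

class Critic(nn.Module):
    """Estimates Q(state, reference, action) for the DDPG update."""
    def __init__(self, state_dim, ref_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + ref_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, state, ref, action):
        return self.net(torch.cat([state, ref, action], dim=-1))

def soft_update(target_net, source_net, rho=0.995):
    """Polyak averaging of target-network parameters, as in standard DDPG."""
    with torch.no_grad():
        for t, s in zip(target_net.parameters(), source_net.parameters()):
            t.mul_(rho).add_((1.0 - rho) * s)

if __name__ == "__main__":
    # Shape check: 12 drone states and a 4-dimensional target (hypothetical
    # dimensions) mapped to 4 rotor commands in [-1, 1].
    actor = Actor(state_dim=12, ref_dim=4, action_dim=4)
    rm = ReferenceModel()
    state = torch.zeros(1, 12)
    raw_target = torch.ones(1, 4)
    command = actor(state, rm.step(raw_target))
    print(command.shape)  # torch.Size([1, 4])

In this rendition the reference model acts as a low-pass pre-filter on the target, which is one common way to suppress the oscillation and overshoot the abstract attributes to raw RL policies; the paper's RM may enter the actor-critic structure differently.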
Pages: 18