Low-level autonomous control and tracking of quadrotor using reinforcement learning

Cited by: 62

Authors:

Pi, Chen-Huan [1]

Hu, Kai-Chun [2]

Cheng, Stone [1]

Wu, I-Chen [3,4]

Affiliations:

[1] Natl Chiao Tung Univ, Dept Mech Engn, Hsinchu, Taiwan

[2] Natl Chiao Tung Univ, Dept Appl Math, Hsinchu, Taiwan

[3] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan

[4] Pervasive Artif Intelligence Res PAIR Labs, Hsinchu, Taiwan

Source:

CONTROL ENGINEERING PRACTICE | 2020 / Vol. 95

Keywords:

Reinforcement learning; Policy gradient; Quadrotor; NETWORKS; GAME; GO;

DOI:

10.1016/j.conengprac.2019.104222

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper proposes a low-level quadrotor control algorithm using neural networks with model-free reinforcement learning, then explores the algorithm's capabilities on quadrotor hover and tracking tasks. We provide a new point of view by examining the well-known policy gradient algorithm from reinforcement learning, then relaxing its requirements to improve training efficiency. Without requiring expert demonstrations, the improved algorithm is then applied to train a quadrotor controller with its output directly mapped to four actuators in a simulator, which is a technique used to control any linear or nonlinear system under unknown dynamic parameters and disturbances. We show two experimental tasks both in simulation and real-world quadrotors to verify our method and demonstrate performance: 1) hovering at a fixed position, and 2) tracking along a specific trajectory.

Pages: 11
