Low-level autonomous control and tracking of quadrotor using reinforcement learning

Cited by: 62
Authors
Pi, Chen-Huan [1]
Hu, Kai-Chun [2]
Cheng, Stone [1]
Wu, I-Chen [3, 4]
Affiliations
[1] Natl Chiao Tung Univ, Dept Mech Engn, Hsinchu, Taiwan
[2] Natl Chiao Tung Univ, Dept Appl Math, Hsinchu, Taiwan
[3] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
[4] Pervasive Artif Intelligence Res PAIR Labs, Hsinchu, Taiwan
Keywords
Reinforcement learning; Policy gradient; Quadrotor; NETWORKS; GAME; GO;
DOI
10.1016/j.conengprac.2019.104222
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper proposes a low-level quadrotor control algorithm using neural networks with model-free reinforcement learning, then explores the algorithm's capabilities on quadrotor hover and tracking tasks. We provide a new point of view by examining the well-known policy gradient algorithm from reinforcement learning, then relaxing its requirements to improve training efficiency. Without requiring expert demonstrations, the improved algorithm is then applied to train a quadrotor controller with its output directly mapped to four actuators in a simulator, which is a technique used to control any linear or nonlinear system under unknown dynamic parameters and disturbances. We show two experimental tasks both in simulation and real-world quadrotors to verify our method and demonstrate performance: 1) hovering at a fixed position, and 2) tracking along a specific trajectory.
Pages: 11
Related papers
35 in total
[1]  
Alexis K., 2011, 2011 IEEE 20th International Symposium on Industrial Electronics (ISIE 2011), P2243, DOI 10.1109/ISIE.2011.5984510
[2]  
Alexis K., 2011, Switching model predictive attitude control for a quadrotor helicopter
[3]  
Alexis K. H. L. au, 2012, MODEL PREDICTIVE QUA
[4]  
Bouabdallah S, 2005, IEEE INT CONF ROBOT, P2247
[5]  
Castillo Alberto, 2019, DISTURBANCE OBSERVER
[6]  
Chovancová Anežka, Fico Tomáš, Chovanec Ľuboš, Hubinský Peter, 2014, MATH MODELLING PARAM
[7]  
Degris T, 2012, P 29 INT CONFERENCE I, P179
[8]   Adaptive Control of Quadrotor UAVs: A Design Trade Study With Flight Evaluations [J].
Dydek, Zachary T. ;
Annaswamy, Anuradha M. ;
Lavretsky, Eugene .
IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2013, 21 (04) :1400-1406
[9]   MULTILAYER FEEDFORWARD NETWORKS ARE UNIVERSAL APPROXIMATORS [J].
HORNIK, K ;
STINCHCOMBE, M ;
WHITE, H .
NEURAL NETWORKS, 1989, 2 (05) :359-366
[10]   Control of a Quadrotor With Reinforcement Learning [J].
Hwangbo, Jemin ;
Sa, Inkyu ;
Siegwart, Roland ;
Hutter, Marco .
IEEE ROBOTICS AND AUTOMATION LETTERS, 2017, 2 (04) :2096-2103