REINFORCEMENT LEARNING BASED LINEAR QUADRATIC REGULATOR FOR THE CONTROL OF A QUADCOPTER

Cited by: 0
Authors
Kashyap, Vishal [1 ]
Vepa, Ranjan [2 ]
Affiliations
[1] Queen Mary Univ London, Sch Engn & Mat Sci, London E1 4NS, England
[2] Queen Mary Univ London, Aerosp Engn, Sch Engn & Mat Sci, London E1 4NS, England
Source
AIAA SCITECH 2023 FORUM | 2023
Abstract
In a practical implementation of the Linear Quadratic Regulator (LQR) for the control of a quadrotor drone, a key problem is the choice of the state and control weighting matrices. In this paper, we propose a Reinforcement Learning based LQR for quadrotor control. Leveraging advances in deep reinforcement learning, a Deep Deterministic Policy Gradient (DDPG) model is used to tune the elements of the Q matrix to achieve a faster response while minimizing the integral square error (ISE). By the properties of the LQR control law, the LQR-DDPG controller is optimal and asymptotically stable. The proposed controller is compared with four other extensively used methods of choosing the Q matrix. In the first method, the Q matrix is selected to be the identity matrix. In the second method, Bryson's rule is used to set the Q matrix, which is not updated subsequently. Similar to the second method, the third method uses Bryson's rule to set the Q matrix but adds a proportional derivative controller alongside the LQR. The fourth method selects the Q matrix with an iterative optimization algorithm that minimizes the integral square error (ISE) over the training trajectories. The simulation results show that LQR-DDPG outperforms all benchmark cases in terms of rise time, settling time, and time of flight, each by a margin of 10% or more.
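The two baseline ingredients named in the abstract — computing an LQR gain and seeding the Q matrix with Bryson's rule — can be sketched in a few lines. The double-integrator model, state and control limits, and step size below are illustrative assumptions for a single axis, not the paper's quadrotor dynamics:

```python
# Sketch: discrete-time LQR gain via fixed-point iteration of the Riccati
# recursion, with Q and R seeded by Bryson's rule (1 / max_value^2 on the
# diagonal). Pure Python; all numerical values here are assumed examples.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Double integrator (position, velocity) discretized with step dt.
dt = 0.05
A = [[1.0, dt], [0.0, 1.0]]
B = [[0.5 * dt ** 2], [dt]]

# Bryson's rule: weight each state / control by 1 / (max acceptable value)^2.
x_max = [1.0, 2.0]          # assumed max position error, max velocity
u_max = 4.0                 # assumed max control effort
Q = [[1.0 / x_max[0] ** 2, 0.0], [0.0, 1.0 / x_max[1] ** 2]]
R = [[1.0 / u_max ** 2]]

# Iterate the discrete Riccati recursion to a steady-state P.
P = [row[:] for row in Q]
for _ in range(500):
    BtP = mat_mul(transpose(B), P)
    S = mat_add(R, mat_mul(BtP, B))                  # 1x1 for a scalar input
    K = [[v / S[0][0] for v in mat_mul(BtP, A)[0]]]  # K = S^-1 B' P A
    AK = mat_add(A, [[-B[i][0] * K[0][j] for j in range(2)] for i in range(2)])
    # Joseph-form update: P = Q + K' R K + (A - B K)' P (A - B K)
    P = mat_add(mat_add(Q, mat_mul(transpose(K), mat_mul(R, K))),
                mat_mul(transpose(AK), mat_mul(P, AK)))

print("LQR gain K:", K)
```

The DDPG agent in the paper would, in effect, replace the hand-set `x_max` / `u_max` choices above with learned Q entries that minimize the ISE.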
Pages: 11
References
34 in total
[1]  
Araar O, 2014, 2014 UKACC INTERNATIONAL CONFERENCE ON CONTROL (CONTROL), P133, DOI 10.1109/CONTROL.2014.6915128
[2]  
Branch S. T., 2011, International Journal of Intelligent Information Processing, V2, P74
[3]  
Bryson A., 1969, APPL OPTIMAL CONTROL
[4]  
Deng XF, 2017, CHIN CONT DECIS CONF, P832, DOI 10.1109/CCDC.2017.7978635
[5]  
Ghoreishi S. A., 2011, International Journal of Intelligent Information Processing, P74
[6]   A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients [J].
Grondman, Ivo ;
Busoniu, Lucian ;
Lopes, Gabriel A. D. ;
Babuska, Robert .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (06) :1291-1307
[7]  
Hespanha JP, 2009, LINEAR SYSTEMS THEORY, P204
[8]  
Ioffe S, 2015, Arxiv, DOI arXiv:1502.03167
[9]  
Jacknoon A, 2017, 2017 INTERNATIONAL CONFERENCE ON COMMUNICATION, CONTROL, COMPUTING AND ELECTRONICS ENGINEERING (ICCCCEE)
[10]  
Kalman R.E., 1960, BOL SOC MAT MEX, V5, P102