REINFORCEMENT LEARNING BASED LINEAR QUADRATIC REGULATOR FOR THE CONTROL OF A QUADCOPTER

Cited by: 0
Authors
Kashyap, Vishal [1 ]
Vepa, Ranjan [2 ]
Affiliations
[1] Queen Mary Univ London, Sch Engn & Mat Sci, London E1 4NS, England
[2] Queen Mary Univ London, Aerosp Engn, Sch Engn & Mat Sci, London E1 4NS, England
Source
AIAA SCITECH 2023 FORUM | 2023
Abstract
In a practical implementation of the Linear Quadratic Regulator (LQR) for the control of a quadrotor drone, a key problem is the choice of the state and control weighting matrices. In this paper, we propose a Reinforcement Learning based LQR for quadrotor control. Leveraging advances in deep reinforcement learning, a Deep Deterministic Policy Gradient (DDPG) model is used to tune the elements of the Q matrix to achieve a faster response while minimizing the integral square error (ISE). By the properties of the LQR control law, the LQR-DDPG controller is optimal and asymptotically stable. The proposed controller is compared with four other extensively used methods of choosing the Q matrix. In the first method, the Q matrix is selected to be the identity matrix. In the second method, Bryson's rule is used to set the Q matrix, which is not updated subsequently. Similar to the second method, the third method uses Bryson's rule to set the Q matrix but adds a proportional derivative controller alongside the LQR. The fourth method selects the Q matrix with an iterative optimization algorithm that minimizes the integral square error (ISE) over the training trajectories. The simulation results show that LQR-DDPG outperforms all benchmark cases in terms of rise time, settling time, and time of flight, each by a margin of 10% or more.
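The two baseline ingredients named in the abstract — computing an LQR gain and seeding the Q matrix with Bryson's rule — can be sketched in a few lines. The double-integrator model, state and control limits, and step size below are illustrative assumptions for a single axis, not the paper's quadrotor dynamics:

```python
# Sketch: discrete-time LQR gain via fixed-point iteration of the Riccati
# recursion, with Q and R seeded by Bryson's rule (1 / max_value^2 on the
# diagonal). Pure Python; all numerical values here are assumed examples.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Double integrator (position, velocity) discretized with step dt.
dt = 0.05
A = [[1.0, dt], [0.0, 1.0]]
B = [[0.5 * dt ** 2], [dt]]

# Bryson's rule: weight each state / control by 1 / (max acceptable value)^2.
x_max = [1.0, 2.0]          # assumed max position error, max velocity
u_max = 4.0                 # assumed max control effort
Q = [[1.0 / x_max[0] ** 2, 0.0], [0.0, 1.0 / x_max[1] ** 2]]
R = [[1.0 / u_max ** 2]]

# Iterate the discrete Riccati recursion to a steady-state P.
P = [row[:] for row in Q]
for _ in range(500):
    BtP = mat_mul(transpose(B), P)
    S = mat_add(R, mat_mul(BtP, B))                  # 1x1 for a scalar input
    K = [[v / S[0][0] for v in mat_mul(BtP, A)[0]]]  # K = S^-1 B' P A
    AK = mat_add(A, [[-B[i][0] * K[0][j] for j in range(2)] for i in range(2)])
    # Joseph-form update: P = Q + K' R K + (A - B K)' P (A - B K)
    P = mat_add(mat_add(Q, mat_mul(transpose(K), mat_mul(R, K))),
                mat_mul(transpose(AK), mat_mul(P, AK)))

print("LQR gain K:", K)
```

The DDPG agent in the paper would, in effect, replace the hand-set `x_max` / `u_max` choices above with learned Q entries that minimize the ISE.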
Pages: 11
References
34 in total
[1]  
Araar O, 2014, 2014 UKACC INTERNATIONAL CONFERENCE ON CONTROL (CONTROL), P133, DOI 10.1109/CONTROL.2014.6915128
[2]  
Branch S. T., 2011, International Journal of Intelligent Information Processing, V2, P74
[3]  
Bryson A., 1969, APPL OPTIMAL CONTROL
[4]  
Deng XF, 2017, CHIN CONT DECIS CONF, P832, DOI 10.1109/CCDC.2017.7978635
[5]  
Ghoreishi S. A., 2011, International Journal of Intelligent Information Processing, P74
[6]   A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients [J].
Grondman, Ivo ;
Busoniu, Lucian ;
Lopes, Gabriel A. D. ;
Babuska, Robert .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (06) :1291-1307
[7]  
Hespanha JP, 2009, LINEAR SYSTEMS THEORY, P204
[8]  
Ioffe S, 2015, Arxiv, DOI arXiv:1502.03167
[9]  
Jacknoon A, 2017, 2017 INTERNATIONAL CONFERENCE ON COMMUNICATION, CONTROL, COMPUTING AND ELECTRONICS ENGINEERING (ICCCCEE)
[10]  
Kalman R.E., 1960, BOL SOC MAT MEX, V5, P102