Application of Reinforcement Learning in Controlling Quadrotor UAV Flight Actions

Cited by: 2

Authors:

Shen, Shang-En [1]

Huang, Yi-Cheng [1]

Affiliations:

[1] Natl Chung Hsing Univ, Dept Mech Engn, Taichung 40227, Taiwan

Source:

DRONES | 2024 / Vol. 8 / Issue 11

Keywords:

quadrotor UAV; reinforcement learning; logic control; target recognition; action decision making

DOI:

10.3390/drones8110660

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology];

Subject classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

The literature has extensively discussed reinforcement learning (RL) for controlling rotorcraft drones during traversal tasks. However, most studies give inadequate detail on the design of reward and punishment mechanisms, and few explore the feasibility of applying RL to actual flight control after simulation experiments. Consequently, this study focuses on reward and punishment design and on the state input for RL. The simulation environment is built with AirSim and Unreal Engine, with onboard camera images serving as the state input. Three RL algorithms suitable for discrete-action training, Deep Q Network (DQN), Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO), were each combined with three different reward and punishment designs for training and testing. The results indicate that PPO with a continuous return method as the reward mechanism converges effectively during training and achieves a target traversal rate of 71% in the testing environment. Furthermore, this study proposes integrating the YOLOv7-tiny object detection (OD) system to assess the applicability of RL in real-world settings. After unifying the state inputs of the simulated and OD environments and replacing the original simulated image inputs with a maximum dual-target approach, the simulation experiment achieved a target traversal rate of 52%. In summary, this research formulates a logical framework for RL reward and punishment design, deployed together with real-time YOLO object detection, as a useful aid for related RL studies.

Pages: 25
