Application of Reinforcement Learning in Controlling Quadrotor UAV Flight Actions

Cited by: 2
Authors
Shen, Shang-En [1]
Huang, Yi-Cheng [1]
Affiliations
[1] Natl Chung Hsing Univ, Dept Mech Engn, Taichung 40227, Taiwan
Keywords
quadrotor UAV; reinforcement learning; logic control; target recognition; action decision making;
DOI
10.3390/drones8110660
Chinese Library Classification
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Reinforcement learning (RL) for controlling rotorcraft drones during traversal tasks has been discussed extensively in the literature. However, most studies give little detail on the design of reward and punishment mechanisms, and few explore whether RL remains feasible for actual flight control after simulation experiments. This study therefore focuses on reward and punishment design and on the state input for RL. The simulation environment is built with AirSim and Unreal Engine, and onboard camera footage serves as the state input. Three RL algorithms suited to discrete-action training, Deep Q Network (DQN), Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO), were each combined with three reward and punishment designs for training and testing. The results indicate that PPO with a continuous return method as the reward mechanism converges effectively during training and achieves a target traversal rate of 71% in the testing environment. To assess the applicability of RL in real-world settings, the study further proposes integrating the YOLOv7-tiny object detection (OD) system. With the state inputs of the simulated and OD environments unified, and the original simulated image inputs replaced by a maximum dual-target approach, the simulation experiment achieved a final target traversal rate of 52%. In summary, this research formulates a logical framework for RL reward and punishment design, deployed together with real-time YOLO object detection, as a useful aid for related RL studies.
Pages: 25
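The abstract contrasts several reward and punishment designs without giving their formulas. As a rough illustration only, the sketch below compares a sparse reward with one possible reading of a "continuous" reward, in which the agent receives a small signal at every step for moving closer to the target. The `StepInfo` fields, the function names, and the `scale` value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step summary of the drone's situation."""
    prev_dist: float      # distance to the target before the action (m)
    curr_dist: float      # distance to the target after the action (m)
    passed_target: bool   # True if the drone traversed the target this step
    collided: bool        # True if the drone hit an obstacle this step


def sparse_reward(info: StepInfo) -> float:
    """Reward only terminal events; every other step returns zero."""
    if info.collided:
        return -1.0
    if info.passed_target:
        return 1.0
    return 0.0


def continuous_reward(info: StepInfo, scale: float = 0.1) -> float:
    """Same terminal rewards, plus a dense term for progress toward the target.

    The dense term is positive when the distance shrinks and negative when
    it grows. `scale` is an assumed tuning constant, not a value from the paper.
    """
    if info.collided:
        return -1.0
    if info.passed_target:
        return 1.0
    return scale * (info.prev_dist - info.curr_dist)


if __name__ == "__main__":
    approaching = StepInfo(prev_dist=5.0, curr_dist=4.0,
                           passed_target=False, collided=False)
    print(sparse_reward(approaching), continuous_reward(approaching))
```

A dense signal of this kind gives the policy feedback on every step instead of only at traversal or collision, which is one common reason shaped rewards are easier to train on. Whether this matches the paper's continuous return method would need to be checked against the full text.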