A Hybrid Human-in-the-Loop Deep Reinforcement Learning Method for UAV Motion Planning for Long Trajectories with Unpredictable Obstacles

Cited: 11
Authors
Zhang, Sitong [1 ]
Li, Yibing [1 ]
Ye, Fang [2 ]
Geng, Xiaoyu [1 ]
Zhou, Zitao [1 ]
Shi, Tuo [3 ]
Affiliations
[1] Harbin Engn Univ, Coll Informat & Commun Engn, Key Lab Adv Marine Commun & Informat Technol, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Coll Informat & Commun Engn, Natl Key Lab Underwater Acoust Technol, Harbin 150001, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
unmanned aerial vehicles; collision avoidance; global path planning; DRL-based motion planning; RRT; navigation;
DOI
10.3390/drones7050311
Chinese Library Classification
TP7 [Remote Sensing Technology];
Discipline Codes
081102; 0816; 081602; 083002; 1404;
Abstract
Unmanned Aerial Vehicles (UAVs) can be an important component of the Internet of Things (IoT) ecosystem due to their ability to collect and transmit data from remote and hard-to-reach areas. Ensuring collision-free navigation for these UAVs is crucial to achieving this goal. However, existing UAV collision-avoidance methods face two challenges: conventional path-planning methods are energy-intensive and computationally demanding, while deep reinforcement learning (DRL)-based motion-planning methods are prone to leaving UAVs trapped in complex environments, especially on long trajectories with unpredictable obstacles, owing to the UAVs' limited sensing ability. To address these challenges, we propose a hybrid collision-avoidance method for the real-time navigation of UAVs in complex environments with unpredictable obstacles. We first develop a Human-in-the-Loop DRL (HL-DRL) training module for mapless obstacle avoidance and then establish a global-planning module that generates a small number of waypoints as guidance. Moreover, a novel goal-updating algorithm is proposed to integrate the HL-DRL training module with the global-planning module by adaptively determining the waypoint to be reached next. The proposed method is evaluated in different simulated environments. Results demonstrate that our approach can rapidly adapt to environmental changes with short replanning times and prevents the UAV from getting stuck in maze-like environments.
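The goal-updating step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the 2-D tuple waypoint representation, and the fixed reach radius are all assumptions; it only shows the general idea of adaptively advancing the active waypoint so the local DRL policy is always steered toward the next unreached goal.

```python
import math

def update_goal(uav_pos, waypoints, current_idx, reach_radius=1.0):
    """Advance the active waypoint index once the UAV is close enough.

    Repeatedly skips ahead while the candidate waypoint is already
    within reach_radius, so the UAV is never steered backwards along
    the globally planned path.
    """
    idx = current_idx
    while idx < len(waypoints) - 1 and math.dist(uav_pos, waypoints[idx]) <= reach_radius:
        idx += 1
    return idx

# Example: three waypoints along the x-axis.
wps = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(update_goal((0.5, 0.0), wps, 0))  # near wps[0] -> advances to 1
print(update_goal((2.0, 0.0), wps, 0))  # far from wps[0] -> stays at 0
```

In a full system this check would run at each control step, with the returned waypoint fed to the DRL policy as its current goal.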
Pages: 26