A Hybrid Human-in-the-Loop Deep Reinforcement Learning Method for UAV Motion Planning for Long Trajectories with Unpredictable Obstacles

Cited: 11
Authors
Zhang, Sitong [1 ]
Li, Yibing [1 ]
Ye, Fang [2 ]
Geng, Xiaoyu [1 ]
Zhou, Zitao [1 ]
Shi, Tuo [3 ]
Affiliations
[1] Harbin Engn Univ, Coll Informat & Commun Engn, Key Lab Adv Marine Commun & Informat Technol, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Coll Informat & Commun Engn, Natl Key Lab Underwater Acoust Technol, Harbin 150001, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
unmanned aerial vehicles; collision avoidance; global path planning; DRL-based motion planning; RRT; navigation;
DOI
10.3390/drones7050311
Chinese Library Classification
TP7 [Remote Sensing Technology];
Discipline Codes
081102; 0816; 081602; 083002; 1404;
Abstract
Unmanned Aerial Vehicles (UAVs) can be an important component of the Internet of Things (IoT) ecosystem due to their ability to collect and transmit data from remote and hard-to-reach areas. Ensuring collision-free navigation for these UAVs is crucial to achieving this goal. However, existing UAV collision-avoidance methods face two challenges: conventional path-planning methods are energy-intensive and computationally demanding, while deep reinforcement learning (DRL)-based motion-planning methods are prone to leaving UAVs trapped in complex environments, especially on long trajectories with unpredictable obstacles, owing to the UAVs' limited sensing ability. To address these challenges, we propose a hybrid collision-avoidance method for the real-time navigation of UAVs in complex environments with unpredictable obstacles. We first develop a Human-in-the-Loop DRL (HL-DRL) training module for mapless obstacle avoidance and then establish a global-planning module that generates a small number of waypoints as guidance. Moreover, a novel goal-updating algorithm is proposed to integrate the HL-DRL training module with the global-planning module by adaptively determining the waypoint to be reached next. The proposed method is evaluated in different simulated environments. Results demonstrate that our approach can rapidly adapt to environmental changes with short replanning times and prevents the UAV from getting stuck in maze-like environments.
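The goal-updating step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the 2-D tuple waypoint representation, and the fixed reach radius are all assumptions; it only shows the general idea of adaptively advancing the active waypoint so the local DRL policy is always steered toward the next unreached goal.

```python
import math

def update_goal(uav_pos, waypoints, current_idx, reach_radius=1.0):
    """Advance the active waypoint index once the UAV is close enough.

    Repeatedly skips ahead while the candidate waypoint is already
    within reach_radius, so the UAV is never steered backwards along
    the globally planned path.
    """
    idx = current_idx
    while idx < len(waypoints) - 1 and math.dist(uav_pos, waypoints[idx]) <= reach_radius:
        idx += 1
    return idx

# Example: three waypoints along the x-axis.
wps = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(update_goal((0.5, 0.0), wps, 0))  # near wps[0] -> advances to 1
print(update_goal((2.0, 0.0), wps, 0))  # far from wps[0] -> stays at 0
```

In a full system this check would run at each control step, with the returned waypoint fed to the DRL policy as its current goal.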
Pages: 26