Hierarchical path planner for unknown space exploration using reinforcement learning-based intelligent frontier selection

Cited by: 12

Authors:

Fan, Jie [1]

Zhang, Xudong [1]

Zou, Yuan [1]

Affiliations:

[1] Beijing Inst Technol, Natl Engn Res Ctr Elect Vehicles, Sch Mech Engn, Beijing 100081, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2023 / Vol. 230

Keywords:

Path planning; Trust region policy optimization; Edge detection; Unknown space exploration;

DOI:

10.1016/j.eswa.2023.120630

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Path planning in unknown environments is useful for tasks such as planetary exploration, search and rescue in disaster areas, and home sweeping services. However, existing frontier-based path planners suffer from insufficient exploration, while reinforcement learning (RL)-based planners face problems in efficient training and effective searching. To overcome these problems, this paper proposes a novel hierarchical path planner for unknown space exploration using RL-based intelligent frontier selection. First, by decomposing the path planner into a three-layered architecture (perception layer, planning layer, and control layer) and using edge detection to find potential frontiers to track, the path search space is shrunk from the whole map to a handful of points of interest, which significantly saves computational resources in both training and execution. Second, an advanced RL algorithm, trust region policy optimization (TRPO), is used as a judge to select the best frontier for the robot to track, which ensures the optimality of the path planner with a shorter path length. The proposed method is validated through simulation and compared with both classic and state-of-the-art methods. Results show that the training process could be greatly accelerated compared with the traditional deep Q-network (DQN). Moreover, the proposed method achieves a 4.2%-14.3% improvement in exploration region rate and the highest exploration completeness.
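To make the frontier idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes an occupancy grid encoded with hypothetical values (0 = free, 1 = occupied, -1 = unknown) and treats a frontier as a free cell that borders an unknown cell, which is a simple neighborhood form of edge detection. The function `select_frontier` is a nearest-frontier placeholder standing in for the learned TRPO policy described in the abstract; all names here are illustrative assumptions.

```python
import numpy as np

# Assumed cell encoding for this sketch (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid):
    """Return (row, col) indices of free cells with a 4-connected unknown neighbor."""
    # Pad with False so cells on the map border have no out-of-map "unknown" neighbor.
    unknown = np.pad(grid == UNKNOWN, 1, constant_values=False)
    near_unknown = (
        unknown[:-2, 1:-1]    # neighbor above
        | unknown[2:, 1:-1]   # neighbor below
        | unknown[1:-1, :-2]  # neighbor to the left
        | unknown[1:-1, 2:]   # neighbor to the right
    )
    return np.argwhere((grid == FREE) & near_unknown)


def select_frontier(frontiers, robot_rc):
    """Placeholder selector: pick the frontier nearest to the robot.

    In the paper this decision is made by a trained TRPO policy; the
    distance heuristic here is only a stand-in for illustration.
    """
    if len(frontiers) == 0:
        return None
    dists = np.linalg.norm(frontiers - np.asarray(robot_rc), axis=1)
    return tuple(int(v) for v in frontiers[np.argmin(dists)])


grid = np.array([
    [0, 0, -1, -1],
    [0, 0, -1, -1],
    [0, 1, -1, -1],
    [0, 0,  0, -1],
])
frontiers = find_frontiers(grid)
target = select_frontier(frontiers, robot_rc=(0, 0))
print(frontiers.tolist(), target)
```

The point of the sketch is the reduction in search space: the selector only has to choose among a few frontier cells rather than every cell in the map, which is the property the abstract credits for cheaper training and execution.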


Pages: 17

