Hierarchical path planner for unknown space exploration using reinforcement learning-based intelligent frontier selection

Cited by: 12
Authors
Fan, Jie [1 ]
Zhang, Xudong [1 ]
Zou, Yuan [1 ]
Affiliations
[1] Beijing Inst Technol, Natl Engn Res Ctr Elect Vehicles, Sch Mech Engn, Beijing 100081, Peoples R China
Keywords
Path planning; Trust region policy optimization; Edge detection; Unknown space exploration;
DOI
10.1016/j.eswa.2023.120630
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Path planning in unknown environments is extremely useful for specific tasks such as exploration of outer-space planets, search and rescue in disaster areas, and home sweeping services. However, existing frontier-based path planners suffer from insufficient exploration, while reinforcement learning (RL)-based ones face difficulties in efficient training and effective searching. To overcome these problems, this paper proposes a novel hierarchical path planner for unknown space exploration using RL-based intelligent frontier selection. Firstly, by decomposing the path planner into a three-layer architecture (comprising a perception layer, a planning layer, and a control layer) and using edge detection to find potential frontiers to track, the path search space is shrunk from the whole map to a handful of points of interest, which significantly saves computational resources during both training and execution. Secondly, an advanced RL algorithm, trust region policy optimization (TRPO), is used as a judge to select the best frontier for the robot to track, which ensures the optimality of the path planner with a shorter path length. The proposed method is validated through simulation and compared with both classic and state-of-the-art methods. Results show that the training process is greatly accelerated compared with a traditional deep Q-network (DQN). Moreover, the proposed method improves the exploration region rate by 4.2%-14.3% and achieves the highest exploration completeness.
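The frontier extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid encoding (0 = free, 1 = occupied, -1 = unknown) and the `find_frontiers` helper are assumptions for illustration only. It captures the core idea that the free/unknown boundary, which edge detection would pick out, yields a handful of candidate points instead of the whole map.

```python
import numpy as np

def find_frontiers(grid):
    """Return coordinates of frontier cells in an occupancy grid.

    Grid encoding (assumed): 0 = free, 1 = occupied, -1 = unknown.
    A frontier cell is a free cell with at least one unknown
    4-neighbour, i.e. it lies on the boundary between explored
    free space and unexplored space.
    """
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:          # only free cells can be frontiers
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == -1:
                    frontiers.append((r, c))
                    break                # one unknown neighbour is enough
    return frontiers

# Toy 4x4 map: left half explored (one obstacle), right half unknown.
grid = np.array([
    [0, 0, -1, -1],
    [0, 0, -1, -1],
    [0, 1, -1, -1],
    [0, 0, -1, -1],
])
print(find_frontiers(grid))  # → [(0, 1), (1, 1), (3, 1)]
```

In the paper's planner an RL policy (TRPO) then scores these candidates and selects the one to track; the sketch above only covers the candidate-generation step that shrinks the search space.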
Pages: 17
References (45 total)
[1]   Adaptive Control and Intersections with Reinforcement Learning [J].
Annaswamy, Anuradha M. .
ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2023, 6 :65-93
[2]  
Birkin, 2016, EUROPEAN CLEANING J
[3]   Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning [J].
Brunke, Lukas ;
Greeff, Melissa ;
Hall, Adam W. ;
Yuan, Zhaocong ;
Zhou, Siqi ;
Panerati, Jacopo ;
Schoellig, Angela P. .
ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 :411-444
[4]   Modular Deep Reinforcement Learning for Continuous Motion Planning With Temporal Logic [J].
Cai, Mingyu ;
Hasanbeig, Mohammadhosein ;
Xiao, Shaoping ;
Abate, Alessandro ;
Kan, Zhen .
IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (04) :7973-7980
[5]   Robotic space exploration agents [J].
Chien, Steve ;
Wagstaff, Kiri L. .
SCIENCE ROBOTICS, 2017, 2 (07)
[6]  
Dai AN, 2020, IEEE INT CONF ROBOT, P9570, DOI [10.1109/ICRA40945.2020.9196707, 10.1109/icra40945.2020.9196707]
[7]   Camera view planning based on generative adversarial imitation learning in indoor active exploration [J].
Dai, Xu-Yang ;
Meng, Qing-Hao ;
Jin, Sheng ;
Liu, Yin-Bo .
APPLIED SOFT COMPUTING, 2022, 129
[8]   The current state and future outlook of rescue robotics [J].
Delmerico, Jeffrey ;
Mintchev, Stefano ;
Giusti, Alessandro ;
Gromov, Boris ;
Melo, Kamilo ;
Horvat, Tomislav ;
Cadena, Cesar ;
Hutter, Marco ;
Ijspeert, Auke ;
Floreano, Dario ;
Gambardella, Luca M. ;
Siegwart, Roland ;
Scaramuzza, Davide .
JOURNAL OF FIELD ROBOTICS, 2019, 36 (07) :1171-1191
[9]  
Deng D, 2020, IEEE ASME INT C ADV, P1497, DOI 10.1109/AIM43001.2020.9158881
[10]   On Benchmarking of Frontier-Based Multi-Robot Exploration Strategies [J].
Faigl, Jan ;
Kulich, Miroslav .
2015 EUROPEAN CONFERENCE ON MOBILE ROBOTS (ECMR), 2015,