2D LiDAR Based Reinforcement Learning for Multi-Target Path Planning in Unknown Environment

Cited by: 4
Authors
Abdalmanan, Nasr [1]
Kamarudin, Kamarulzaman [1, 2]
Abu Bakar, Muhammad Aizat [1]
Rahiman, Mohd Hafiz Fazalul [1]
Zakaria, Ammar [1, 2]
Mamduh, Syed Muhammad [2]
Kamarudin, Latifah Munirah [2]
Affiliations
[1] Univ Malaysia Perlis, Fac Elect Engn & Technol, Arau 02600, Malaysia
[2] Univ Malaysia Perlis, Ctr Excellence Adv Sensor Technol CEASTech, Arau 02600, Malaysia
Keywords
Path planning; Q-learning; Robot sensing systems; Training; Mobile robots; Convergence; RL; mobile robot; GENETIC ALGORITHM; NAVIGATION
DOI
10.1109/ACCESS.2023.3265207
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Global path planning techniques have been widely employed to solve path planning problems; however, they have been found to be unsuitable for unknown environments. Conversely, the traditional Q-learning method, a common reinforcement learning approach for local path planning, is unable to complete the task for multiple targets. To address these limitations, this paper proposes a modified Q-learning method, called Vector Field Histogram based Q-learning (VFH-QL), which uses VFH information, obtained from a 2D LiDAR sensor, in the state space representation and the reward function. We compared the performance of the proposed method with that of the classical Q-learning method (CQL) through training experiments conducted in a simulated environment with a size of 400 square pixels, representing a 20-meter square map. The environment contained static obstacles and a single mobile robot. Two experiments were conducted: experiment A involved path planning for a single target, while experiment B involved path planning for multiple targets. In experiment A, the VFH-QL method required 87.06% less training time and achieved 99.98% better obstacle avoidance than CQL. In experiment B, the VFH-QL method had an average training time 95.69% lower than that of the CQL method and 83.99% better path quality. The VFH-QL method was then evaluated using a benchmark dataset. The results indicated that VFH-QL exhibited superior path quality, with an efficiency of 94.89% and improvements of 96.91% and 96.69% over CQL and SARSA, respectively, in the task of path planning for multiple targets in unknown environments.
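The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea rather than the authors' VFH-QL implementation. It assumes a binary polar histogram built from a 2D LiDAR scan as the state, and the standard tabular Q-learning update. The function names, the sector count, the distance threshold, and the learning parameters are all hypothetical choices made for illustration; the record does not specify them.

```python
import random
from collections import defaultdict


def vfh_state(ranges, n_sectors=8, threshold=1.0):
    """Collapse a LiDAR scan into a binary polar histogram (assumed state encoding).

    A sector is marked 1 (blocked) if any beam in it is shorter than
    `threshold` metres, otherwise 0 (free). Beams left over when the scan
    length is not divisible by `n_sectors` are ignored in this sketch.
    """
    per_sector = len(ranges) // n_sectors
    bits = []
    for s in range(n_sectors):
        beams = ranges[s * per_sector:(s + 1) * per_sector]
        bits.append(1 if min(beams) < threshold else 0)
    return tuple(bits)


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, n_actions, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection over the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[(state, a)])


# Usage: a 16-beam scan with one close obstacle in the first sector.
scan = [5.0] * 16
scan[0] = 0.5
state = vfh_state(scan)
q_table = defaultdict(float)  # unseen (state, action) pairs default to 0.0
new_value = q_update(q_table, state, action=0, reward=1.0,
                     next_state=state, n_actions=4)
greedy = choose_action(q_table, state, n_actions=4, epsilon=0.0)
```

Encoding the state from the histogram rather than from map coordinates keeps the Q-table independent of any particular map, which is one plausible reading of why a VFH-based state would suit unknown environments; the paper itself should be consulted for the actual state and reward definitions.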
Pages: 35541-35555
Number of pages: 15