2D LiDAR Based Reinforcement Learning for Multi-Target Path Planning in Unknown Environment

Cited by: 4

Authors:

Abdalmanan, Nasr [1]

Kamarudin, Kamarulzaman [1,2]

Abu Bakar, Muhammad Aizat [1]

Rahiman, Mohd Hafiz Fazalul [1]

Zakaria, Ammar [1,2]

Mamduh, Syed Muhammad [2]

Kamarudin, Latifah Munirah [2]

Affiliations:

[1] Univ Malaysia Perlis, Fac Elect Engn & Technol, Arau 02600, Malaysia

[2] Univ Malaysia Perlis, Ctr Excellence Adv Sensor Technol CEASTech, Arau 02600, Malaysia

Source:

IEEE ACCESS | 2023 / Volume 11

Keywords:

Path planning; Q-learning; Robot sensing systems; Training; Mobile robots; Convergence; RL; path planning; mobile robot; GENETIC ALGORITHM; NAVIGATION;

DOI:

10.1109/ACCESS.2023.3265207

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Global path planning techniques have been widely employed to solve path planning problems; however, they have been found to be unsuitable for unknown environments. Conversely, the traditional Q-learning method, a common reinforcement learning approach for local path planning, is unable to complete the task for multiple targets. To address these limitations, this paper proposes a modified Q-learning method, called Vector Field Histogram based Q-learning (VFH-QL), which uses VFH information derived from a 2D LiDAR sensor in the state space representation and the reward function. We compared the performance of the proposed method with the classical Q-learning method (CQL) through training experiments conducted in a simulated environment with a size of 400 square pixels, representing a 20-meter square map. The environment contained static obstacles and a single mobile robot. Two experiments were conducted: experiment A involved path planning for a single target, while experiment B involved path planning for multiple targets. In experiment A, VFH-QL required 87.06% less training time and achieved 99.98% better obstacle avoidance than CQL. In experiment B, the average training time of VFH-QL was 95.69% lower than that of CQL, and its path quality was 83.99% better. The VFH-QL method was then evaluated on a benchmark dataset. The results indicated that VFH-QL exhibited superior path quality, with an efficiency of 94.89% and improvements of 96.91% and 96.69% over CQL and SARSA, respectively, in the task of path planning for multiple targets in unknown environments.
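The abstract describes Q-learning whose state is built from Vector Field Histogram information computed from a 2D LiDAR scan. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the function names (`vfh_state`, `q_update`), the sector count, the distance threshold, and the learning parameters are all assumptions chosen for the example, and the paper's actual state encoding and reward function are not given in this record.

```python
from collections import defaultdict


def vfh_state(ranges, n_sectors=8, threshold=1.0):
    """Collapse a LiDAR scan into a binary polar histogram (hypothetical encoding).

    Each sector is 1 if any beam in it is closer than `threshold`, else 0.
    Beams left over when len(ranges) is not divisible by n_sectors are ignored.
    """
    per_sector = len(ranges) // n_sectors
    return tuple(
        int(min(ranges[i * per_sector:(i + 1) * per_sector]) < threshold)
        for i in range(n_sectors)
    )


def q_update(Q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict-backed Q-table."""
    best_next = max(Q[(next_state, a)] for a in range(n_actions))
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return Q[(state, action)]


# Example: a 360-beam scan with an obstacle 0.5 m away in the first 45 beams.
scan = [0.5] * 45 + [5.0] * 315
state = vfh_state(scan)              # only the first sector is blocked
free = vfh_state([5.0] * 360)        # no sector is blocked

Q = defaultdict(float)               # unseen (state, action) pairs start at 0
penalised = q_update(Q, state, 0, -1.0, free, n_actions=8)
```

Because the state is a short tuple of blocked and free sectors rather than a grid position, the Q-table in this sketch does not depend on where a target lies on the map, which is one plausible reason a histogram-based state could transfer across multiple targets.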


Pages: 35541-35555

Page count: 15

