Optimal path planning approach based on Q-learning algorithm for mobile robots

Cited: 87
Authors
Maoudj, Abderraouf [1 ]
Hentout, Abdelfetah [1 ]
Affiliations
[1] Ctr Dev Technol Avancees CDTA, Div Prod & Robot DPR, BP 17, Algiers 16303, Algeria
Keywords
Path optimization; Efficient Q-Learning; Efficient selection strategy; Convergence speed; Training performances; GENETIC ALGORITHM
DOI
10.1016/j.asoc.2020.106796
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Optimizing a path within a short computation time remains a major challenge for mobile robotics applications. In path planning and obstacle avoidance, the Q-Learning (QL) algorithm has been widely used as a computational method of learning through environment interaction. However, little emphasis has been placed on path optimization using QL because of its slow and weak convergence toward optimal solutions. Therefore, this paper proposes an Efficient Q-Learning (EQL) algorithm to overcome these limitations and ensure an optimal collision-free path in the least possible time. In the QL algorithm, successful learning depends closely on the design of an effective reward function and an efficient strategy for selecting the optimal action while balancing exploration and exploitation. In this regard, a new reward function is proposed to initialize the Q-table and provide the robot with prior knowledge of the environment, followed by a new efficient selection strategy that accelerates the learning process through search-space reduction while ensuring rapid convergence toward an optimized solution. The main idea is to intensify the search, at each learning stage, around the straight-line segment linking the robot's current position to the target (the length-optimal path). During the learning process, the proposed strategy favors promising actions that not only lead to an optimized path but also accelerate the convergence of the learning process. The proposed EQL algorithm is first validated on benchmarks from the literature, then compared with other existing QL-based algorithms. The results show that EQL attains good learning proficiency and that its training performance is significantly improved compared to the state of the art. In conclusion, EQL improves path quality in terms of length, computation time and robot safety, and outperforms other optimization algorithms. (C) 2020 Elsevier B.V. All rights reserved.
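To make the abstract's two key ideas concrete, the following is a minimal Python sketch, not the authors' exact EQL implementation: a grid-world Q-Learning planner whose Q-table is initialized from a distance-based reward (prior knowledge of the environment) and whose action selection is biased toward moves that stay close to the straight-line segment from the current cell to the target. The grid size, obstacle set, reward shape, and the hyperparameters ALPHA, GAMMA, EPS and BETA are assumed illustrative values.

```python
import numpy as np

# Illustrative sketch only: grid, obstacles, reward shape and the
# hyperparameters below are assumptions, not values from the paper.
ROWS, COLS = 10, 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
ALPHA, GAMMA, EPS, BETA = 0.5, 0.9, 0.1, 2.0   # assumed hyperparameters
TARGET = (9, 9)
OBSTACLES = {(4, 4), (4, 5), (5, 4)}           # assumed obstacle cells

def dist(a, b):
    """Euclidean distance between two grid cells."""
    return np.hypot(a[0] - b[0], a[1] - b[1])

def valid(cell):
    """A cell is reachable if it is inside the grid and not an obstacle."""
    r, c = cell
    return 0 <= r < ROWS and 0 <= c < COLS and cell not in OBSTACLES

# (i) Reward-based Q-table initialization: each action starts with a value
# reflecting how close its successor cell lies to the target, so the robot
# begins learning with prior knowledge instead of an all-zero table.
Q = np.full((ROWS, COLS, len(ACTIONS)), -np.inf)  # -inf marks invalid moves
for r in range(ROWS):
    for c in range(COLS):
        for a, (dr, dc) in enumerate(ACTIONS):
            nxt = (r + dr, c + dc)
            if valid(nxt):
                Q[r, c, a] = -dist(nxt, TARGET)

def select_action(state, rng):
    """(ii) Biased selection: score each valid move by its Q-value plus a
    bonus for progress along the straight line toward TARGET; with
    probability EPS, explore uniformly over the valid moves instead."""
    scores = np.full(len(ACTIONS), -np.inf)
    for a, (dr, dc) in enumerate(ACTIONS):
        nxt = (state[0] + dr, state[1] + dc)
        if valid(nxt):
            scores[a] = Q[state][a] + BETA * (dist(state, TARGET) - dist(nxt, TARGET))
    candidates = np.flatnonzero(np.isfinite(scores))
    if rng.random() < EPS:
        return int(rng.choice(candidates))
    return int(candidates[np.argmax(scores[candidates])])

rng = np.random.default_rng(0)
for episode in range(200):
    state = (0, 0)
    for _ in range(4 * ROWS * COLS):               # step budget per episode
        a = select_action(state, rng)
        nxt = (state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1])
        reward = 100.0 if nxt == TARGET else -1.0  # assumed reward shape
        future = 0.0 if nxt == TARGET else Q[nxt].max()
        Q[state][a] += ALPHA * (reward + GAMMA * future - Q[state][a])
        state = nxt
        if state == TARGET:
            break
```

The BETA progress bonus is one simple way to realize the "intensify the search around the straight-line segment" idea; the paper's actual selection strategy and reward design may differ in detail.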
Pages: 15
Related Papers
28 records in total
[1] Ajeil, Fatin H.; Ibraheem, Ibraheem Kasim; Sahib, Mouayad A.; Humaidi, Amjad J. Multi-objective path planning of an autonomous mobile robot using hybrid PSO-MFB optimization algorithm. Applied Soft Computing, 2020, 89.
[2] Ayawli, Ben Beklisi Kwame; Mei, Xue; Shen, Mouquan; Appiah, Albert Yaw; Kyeremeh, Frimpong. Optimized RRT-A* path planning method for mobile robots in partially known environment. Information Technology and Control, 2019, 48(2): 179-194.
[3] Bakdi, Azzeddine; Hentout, Abdelfetah; Boutami, Hakim; Maoudj, Abderraouf; Hachour, Ouarda; Bouzouia, Brahim. Optimal path planning and execution for mobile robots using genetic algorithm and adaptive fuzzy-logic control. Robotics and Autonomous Systems, 2017, 89: 95-109.
[4] Brand, M. In: 2010 International Conference on Computer Design and Applications (ICCDA 2010), 2010: 436. DOI 10.1109/ICCDA.2010.5541300.
[5] Cheng, Y.-H. IAENG International Journal of Computer Science, 2019, 46.
[6] Daniel, Kenny; Nash, Alex; Koenig, Sven; Felner, Ariel. Theta*: Any-angle path planning on grids. Journal of Artificial Intelligence Research, 2010, 39: 533-579.
[7] Das, P. K. International Journal of Computer Applications, 2012, 51: 40. DOI 10.5120/8073-1468.
[8] Dorigo, M. International Series in Operations Research & Management Science, 2010, 146: 227. DOI 10.1007/978-1-4419-1665-5_8.
[9] Ganapathy, Velappa; Yun, Soh Chin; Joe, Halim Kusama. Neural Q-learning controller for mobile robot. In: 2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (Vols 1-3), 2009: 863-868.
[10] Goswami, I. Lecture Notes in Computer Science, 2010, 6457: 379. DOI 10.1007/978-3-642-17298-4_40.