Optimized-Weighted-Speedy Q-Learning Algorithm for Multi-UGV in Static Environment Path Planning under Anti-Collision Cooperation Mechanism

Cited: 5
Authors
Cao, Yuanying [1]
Fang, Xi [1]
Affiliations
[1] Wuhan University of Technology, School of Science, Wuhan 430070, People's Republic of China
Keywords
optimized-weighted-speedy Q-learning algorithm; path planning; anti-collision cooperation mechanism; reinforcement learning; unmanned ground vehicle (UGV)
DOI
10.3390/math11112476
CLC Classification
O1 [Mathematics]
Discipline Codes
0701; 070101
Abstract
With the accelerated development of smart cities, the concept of the "smart industrial park", in which unmanned ground vehicles (UGVs) are widely applied, has entered the industrial field of vision. When faced with multiple, heterogeneous tasks, a single UGV executes them inefficiently, so research on task planning under multi-UGV cooperation has become increasingly urgent. In this paper, an improved algorithm, optimized-weighted-speedy Q-learning (OWS Q-learning), is proposed for multi-UGV path planning under an anti-collision cooperation mechanism. The slow convergence of the Q-learning algorithm is overcome to a certain extent by changing the update mode of the Q function. By improving the selection of the learning rate and the action-selection strategy, the balance between exploration and exploitation is maintained, and the learning efficiency of multiple agents in complex environments is improved. Simulation experiments in a static environment show that the designed anti-collision coordination mechanism effectively solves the coordination problem of multiple UGVs in the same scenario. In the same experimental scenario, compared with the Q-learning algorithm and other reinforcement learning algorithms, only the OWS Q-learning algorithm converges, and it yields the shortest collision-free paths for the UGVs and the least planning time. Compared with the Q-learning algorithm, the computation time of the OWS Q-learning algorithm in the three experimental scenarios is reduced by 53.93%, 67.21%, and 53.53%, respectively. This effectively advances the intelligent development of UGVs in smart parks.
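The abstract describes the OWS Q-learning update only at a high level, so the exact optimized-weighting rule is not reproducible from this record. As background, the speedy Q-learning baseline the method builds on (Azar et al., 2011) has a well-known tabular update that combines Bellman backups of the last two Q-tables. Below is a minimal Python sketch of that baseline update with a standard epsilon-greedy action rule, assuming a tabular grid-world MDP; the weighting term, learning-rate schedule, and action-selection improvements that define the OWS variant are the paper's contributions and are not shown here.

```python
import numpy as np

def speedy_q_update(Q_prev, Q_curr, s, a, r, s_next, k, gamma=0.95):
    """One tabular speedy Q-learning step (Azar et al., 2011).

    Q_prev and Q_curr are the Q-tables from iterations k-1 and k,
    indexed as Q[state, action]. Combining the empirical Bellman
    backups of BOTH tables is what gives speedy Q-learning its
    faster convergence over vanilla Q-learning. The step size
    alpha_k = 1/(k+1) follows the original paper; the OWS variant
    replaces this schedule (details are in the cited article).
    """
    alpha = 1.0 / (k + 1)                          # classic SQL step size
    tq_prev = r + gamma * np.max(Q_prev[s_next])   # T Q_{k-1}(s, a)
    tq_curr = r + gamma * np.max(Q_curr[s_next])   # T Q_k(s, a)
    Q_next = Q_curr.copy()
    Q_next[s, a] = (Q_curr[s, a]
                    + alpha * (tq_prev - Q_curr[s, a])
                    + (1.0 - alpha) * (tq_curr - tq_prev))
    return Q_next

def epsilon_greedy(Q, s, n_actions, eps, rng):
    """Standard epsilon-greedy rule balancing exploration and exploitation."""
    if rng.random() < eps:
        return int(rng.integers(n_actions))        # explore: random action
    return int(np.argmax(Q[s]))                    # exploit: greedy action
```

In the paper's setting, each UGV would presumably maintain such a table over the discretized map, with the anti-collision cooperation mechanism constraining joint actions or shaping rewards; those mechanisms are specific to the article and are not part of this sketch.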
Pages: 28