Path planning for intelligent robots based on deep Q-learning with experience replay and heuristic knowledge

Times Cited: 106
Authors
Jiang, Lan [1 ]
Huang, Hongyun [2 ]
Ding, Zuohua [1 ]
Affiliations
[1] Zhejiang Sci Tech Univ, Lab Intelligent Comp & Software Engn, Hangzhou 310018, Peoples R China
[2] Zhejiang Sci Tech Univ, Ctr Multimedia Big Data Lib, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep Q-learning (DQL); experience replay (ER); heuristic knowledge (HK); path planning; REINFORCEMENT; NAVIGATION;
DOI
10.1109/JAS.2019.1911732
CLC Number
TP [automation and computer technology];
Subject Classification Code
0812;
Abstract
Path planning and obstacle avoidance are two challenging problems in the study of intelligent robots. In this paper, we develop a new method to alleviate these problems based on deep Q-learning with experience replay and heuristic knowledge. In this method, a neural network is used to resolve the "curse of dimensionality" caused by the Q-table in reinforcement learning. As the robot moves through an unknown environment, it collects experience data that is later used to train the neural network; this process is called experience replay. Heuristic knowledge keeps the robot from exploring blindly and supplies more informative data for training the network. Simulation results show that, compared with existing methods, our method converges to an optimal action strategy in less time and explores a path in an unknown environment with fewer steps and a larger average reward.
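The two mechanisms named in the abstract can be sketched in a few lines: a fixed-size experience replay buffer, and heuristic-guided rather than purely random exploration. This is a minimal illustration under stated assumptions, not the authors' implementation; the 2-D grid world, the Manhattan-distance heuristic, and all function names here are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience replay buffer storing
    (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random mini-batches break the temporal correlation
        # of consecutive transitions before training the Q-network.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def heuristic_action(state, goal, actions):
    """Toy heuristic: pick the grid move that most reduces the
    Manhattan distance to the goal (a stand-in for domain knowledge)."""
    def dist_after(a):
        nx, ny = state[0] + a[0], state[1] + a[1]
        return abs(goal[0] - nx) + abs(goal[1] - ny)
    return min(actions, key=dist_after)


def select_action(state, goal, actions, epsilon=0.1):
    # Instead of exploring with a uniformly random action, the
    # exploration step is biased by the heuristic, so even exploratory
    # transitions tend to be informative for training.
    if random.random() < epsilon:
        return heuristic_action(state, goal, actions)
    # Placeholder for the greedy choice argmax_a Q(state, a) from the network.
    return random.choice(actions)
```

In a full DQL loop, each `select_action` result produces a transition that is pushed into the buffer, and mini-batches sampled from it drive the network update.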
Pages: 1179-1189
Page count: 11