Path planning for intelligent robots based on deep Q-learning with experience replay and heuristic knowledge

Times Cited: 106
Authors
Jiang, Lan [1 ]
Huang, Hongyun [2 ]
Ding, Zuohua [1 ]
Affiliations
[1] Zhejiang Sci Tech Univ, Lab Intelligent Comp & Software Engn, Hangzhou 310018, Peoples R China
[2] Zhejiang Sci Tech Univ, Ctr Multimedia Big Data Lib, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep Q-learning (DQL); experience replay (ER); heuristic knowledge (HK); path planning; REINFORCEMENT; NAVIGATION;
DOI
10.1109/JAS.2019.1911732
CLC Number
TP [automation and computer technology];
Subject Classification Code
0812;
Abstract
Path planning and obstacle avoidance are two challenging problems in the study of intelligent robots. In this paper, we develop a new method to alleviate these problems based on deep Q-learning with experience replay and heuristic knowledge. In this method, a neural network is used to resolve the "curse of dimensionality" caused by the Q-table in reinforcement learning. As the robot moves through an unknown environment, it collects experience data that is later used to train the neural network; this process is called experience replay. Heuristic knowledge keeps the robot from exploring blindly and supplies more informative data for training the network. Simulation results show that, compared with existing methods, our method converges to an optimal action strategy in less time and explores a path in an unknown environment with fewer steps and a larger average reward.
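The two mechanisms named in the abstract can be sketched in a few lines: a fixed-size experience replay buffer, and heuristic-guided rather than purely random exploration. This is a minimal illustration under stated assumptions, not the authors' implementation; the 2-D grid world, the Manhattan-distance heuristic, and all function names here are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience replay buffer storing
    (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random mini-batches break the temporal correlation
        # of consecutive transitions before training the Q-network.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def heuristic_action(state, goal, actions):
    """Toy heuristic: pick the grid move that most reduces the
    Manhattan distance to the goal (a stand-in for domain knowledge)."""
    def dist_after(a):
        nx, ny = state[0] + a[0], state[1] + a[1]
        return abs(goal[0] - nx) + abs(goal[1] - ny)
    return min(actions, key=dist_after)


def select_action(state, goal, actions, epsilon=0.1):
    # Instead of exploring with a uniformly random action, the
    # exploration step is biased by the heuristic, so even exploratory
    # transitions tend to be informative for training.
    if random.random() < epsilon:
        return heuristic_action(state, goal, actions)
    # Placeholder for the greedy choice argmax_a Q(state, a) from the network.
    return random.choice(actions)
```

In a full DQL loop, each `select_action` result produces a transition that is pushed into the buffer, and mini-batches sampled from it drive the network update.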
Pages: 1179-1189
Page count: 11