Robot Dynamic Path Planning Based on Prioritized Experience Replay and LSTM Network

Cited by: 0
Authors
Li, Hongqi [1 ]
Zhong, Peisi [1 ]
Liu, Li [2 ]
Wang, Xiao [1 ]
Liu, Mei [3 ]
Yuan, Jie [1 ]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Mech & Elect Engn, Qingdao 266590, Peoples R China
[2] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao 266590, Peoples R China
[3] Shandong Univ Sci & Technol, Coll Energy Storage Technol, Qingdao 266590, Peoples R China
Source
IEEE ACCESS | 2025, Vol. 13
Funding
National Natural Science Foundation of China;
Keywords
Heuristic algorithms; Long short term memory; Path planning; Convergence; Robots; Training; Planning; Adaptation models; Accuracy; Deep reinforcement learning; DDQN; LSTM network; mobile robot; path planning; prioritized experience replay; LEARNING ALGORITHM;
DOI
10.1109/ACCESS.2025.3532449
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
To address the slow convergence, poor dynamic adaptability, and path redundancy of the Double Deep Q-Network (DDQN) in complex obstacle environments, this paper proposes an enhanced algorithm within the deep reinforcement learning framework. The algorithm, termed LPDDQN, integrates Prioritized Experience Replay (PER) and a Long Short-Term Memory (LSTM) network into DDQN. First, PER is used to prioritize experience data, with storage and sampling handled by a SumTree structure rather than the conventional experience queue. Second, an LSTM network is introduced to improve the dynamic adaptability of DDQN; because the LSTM consumes sequences, the experience samples must be sliced into subsequences and padded. The proposed LPDDQN algorithm is compared with five other path planning algorithms in both static and dynamic environments. Simulation analysis shows that in a static environment, LPDDQN improves on traditional DDQN in convergence, number of moving steps, success rate, and number of turns by 24.07%, 17.49%, 37.73%, and 61.54%, respectively. In dynamic, complex environments, the success rates of all algorithms except TLD3 and LPDDQN decrease significantly; further analysis shows that LPDDQN outperforms TLD3 by 18.87%, 2.41%, and 39.02% in moving steps, success rate, and number of turns, respectively.
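The SumTree mentioned in the abstract is the standard data structure for proportional prioritized experience replay: leaves hold per-transition priorities and each internal node holds the sum of its subtree, so drawing a sample with probability proportional to priority costs O(log n). The following Python sketch is a rough illustration under that standard scheme, not the authors' implementation; the class name, method names, and ring-buffer overwrite policy are assumptions.

```python
import random

class SumTree:
    """Minimal SumTree: leaves store transition priorities, internal
    nodes store subtree sums, enabling O(log n) proportional sampling."""

    def __init__(self, capacity):
        self.capacity = capacity                   # max number of stored transitions
        self.tree = [0.0] * (2 * capacity - 1)     # full binary tree in a flat array
        self.data = [None] * capacity              # transitions, aligned with the leaves
        self.write = 0                             # next leaf slot to (over)write

    def add(self, priority, transition):
        idx = self.write + self.capacity - 1       # map data slot to leaf index
        self.data[self.write] = transition
        self.update(idx, priority)
        self.write = (self.write + 1) % self.capacity  # overwrite oldest when full

    def update(self, idx, priority):
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx != 0:                            # propagate the change up to the root
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def sample(self):
        s = random.uniform(0.0, self.tree[0])      # tree[0] holds the total priority
        idx = 0
        while 2 * idx + 1 < len(self.tree):        # descend until a leaf is reached
            left = 2 * idx + 1
            if s <= self.tree[left]:
                idx = left
            else:
                s -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]
```

Replaying high-priority (large temporal-difference error) transitions more often is the usual explanation for PER's faster convergence, which is consistent with the convergence gains the abstract reports. Likewise, the slicing and padding that the abstract says the LSTM requires could, assuming fixed-length training subsequences, be sketched with a hypothetical helper like this (the function name and the repeat-last-transition padding rule are illustrative assumptions):

```python
def slice_and_pad(episode, seq_len):
    """Cut one episode (a list of transitions) into fixed-length slices;
    a short final slice is padded by repeating its last transition."""
    slices = []
    for start in range(0, len(episode), seq_len):
        chunk = episode[start:start + seq_len]
        chunk += [chunk[-1]] * (seq_len - len(chunk))  # pad the short tail
        slices.append(chunk)
    return slices
```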
Pages: 22283-22299
Page count: 17