Adaptive Q-learning path planning algorithm based on virtual target guidance

Cited by: 0
Authors
Li Z. [1]; Hu X. [1]; Zhang Y. [1]; Xu J. [1]
Affiliation
[1] School of Electrical Engineering and Automation, Anhui University, Hefei
Source
Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS | 2024, Vol. 30, No. 2
Funding
National Natural Science Foundation of China
Keywords
mobile robots; path planning; Q-learning; reinforcement learning;
DOI
10.13196/j.cims.2022.0733
Abstract
When classical reinforcement learning algorithms are applied to robot path planning in unknown environments, they suffer from low exploration efficiency, slow convergence, a tendency to fall into terrain traps, and a lack of intermediate states in the learning process, which makes exploration blind. To address these problems, a dual memory mechanism, a virtual target guidance method, and an adaptive greedy factor were designed, and an adaptive Q-learning algorithm based on virtual target guidance (VTGA-Q-Learning) was proposed. To verify the performance of the new algorithm, four environment maps were designed and simulation experiments were compared against other improved algorithms. In addition, a virtual simulation experiment with a four-wheel-drive Mecanum-wheel robot was carried out to emulate a real environment and further validate the algorithm. Experimental results showed that the proposed algorithm significantly reduced the number of iterations, improved the convergence speed of reinforcement learning, and was robust in complex environments: it effectively avoided terrain traps, improved the performance of the mobile robot navigation system, and provides a reference for autonomous path planning of mobile robots. © 2024 CIMS. All rights reserved.
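The abstract names the algorithm's three ingredients (dual memory mechanism, virtual target guidance, adaptive greedy factor) but gives no formulas. As a rough, non-authoritative illustration, the Python sketch below shows one conventional way the latter two ideas can appear in tabular Q-learning: an episode-decayed epsilon for the greedy factor, and a distance-based shaping bonus toward an assumed intermediate waypoint. Every name and constant here (GRID, VIRTUAL_TARGET, the 0.995 decay, the 0.1 shaping weight) is a placeholder assumption, not the paper's definition, and the dual memory mechanism is omitted entirely.

```python
import random
from collections import defaultdict

GRID = 10                       # toy 10x10 obstacle-free grid (assumption)
START, GOAL = (0, 0), (9, 9)
VIRTUAL_TARGET = (5, 5)         # assumed fixed intermediate waypoint
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def step(state, action):
    """Clamp the move to the grid; -1 per step, +100 on reaching the goal."""
    dx, dy = ACTIONS[action]
    nxt = (min(GRID - 1, max(0, state[0] + dx)),
           min(GRID - 1, max(0, state[1] + dy)))
    return nxt, (100.0 if nxt == GOAL else -1.0), nxt == GOAL

q = defaultdict(float)          # tabular Q: (state, action) -> value
alpha, gamma = 0.1, 0.95

for episode in range(300):
    # Adaptive greedy factor (assumed schedule): epsilon decays per episode,
    # shifting from exploration toward exploitation as learning stabilizes.
    eps = max(0.05, 0.9 * 0.995 ** episode)
    state = START
    for _ in range(200):
        if random.random() < eps:
            action = random.randrange(len(ACTIONS))
        else:
            action = max(range(len(ACTIONS)), key=lambda a: q[(state, a)])
        nxt, reward, done = step(state, action)
        # Virtual-target guidance as reward shaping (assumed mechanism):
        # a small bonus for reducing the distance to the waypoint.
        reward += 0.1 * (manhattan(state, VIRTUAL_TARGET)
                         - manhattan(nxt, VIRTUAL_TARGET))
        best_next = max(q[(nxt, a)] for a in range(len(ACTIONS)))
        q[(state, action)] += alpha * (reward + gamma * best_next
                                       - q[(state, action)])
        state = nxt
        if done:
            break
```

In the paper's VTGA-Q-Learning the virtual target is presumably relocated as exploration proceeds; it is held fixed here only to keep the sketch short.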
Pages: 553-568
Page count: 15