A Guided-to-Autonomous Policy Learning method of Deep Reinforcement Learning in Path Planning

Cited: 0
Authors
Zhao, Wang [1 ]
Zhang, Ye [1 ]
Li, Haoyu [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Astronaut, Xian, Peoples R China
Source
2024 IEEE 18TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION, ICCA 2024 | 2024
Funding
National Natural Science Foundation of China
Keywords
path planning; Deep Reinforcement Learning; training efficiency; composite optimization; Guided-to-Autonomous Policy Learning;
DOI
10.1109/ICCA62789.2024.10591821
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
This study introduces a Guided-to-Autonomous Policy Learning (GAPL) method that improves the training efficiency and composite optimization of Deep Reinforcement Learning (DRL) in path planning. Under this method, we first introduce guiding rewards as a reward-enhancement mechanism which, built on the Rapidly-exploring Random Trees (RRT) and Artificial Potential Field (APF) algorithms, effectively addresses the challenge of training efficiency. We then propose the Guided-to-Autonomous Reward Transition (GART) model to balance training efficiency against composite optimization: the reward structure, initially dominated by guiding rewards, progressively transitions toward rewards that emphasize composite optimization, specifically minimizing the distance and time to the end point. Simulated experiments in static-obstacle settings and mixed dynamic-static obstacle environments demonstrate that: 1) guiding rewards play a significant role in enhancing training efficiency; and 2) the GAPL method yields superior composite-optimization outcomes for path planning compared with conventional methods, while effectively addressing the training-efficiency issue of conventional DRL methods.
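The abstract's two mechanisms — an APF-style guiding reward and a guiding weight that decays so task rewards take over — can be sketched as below. This is a minimal illustration, not the paper's implementation: the function names, the gains `k_att`, `k_rep`, `d0`, and the linear decay schedule are all assumptions for exposition.

```python
import numpy as np

def apf_guiding_reward(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0):
    """Potential-field-style guiding reward: reward is the negative APF
    potential, so moving toward the goal and away from nearby obstacles
    increases it. Gains and influence radius d0 are illustrative."""
    pos, goal = np.asarray(pos, float), np.asarray(goal, float)
    u_att = 0.5 * k_att * np.sum((pos - goal) ** 2)  # attractive potential
    u_rep = 0.0
    for obs in obstacles:
        d = np.linalg.norm(pos - np.asarray(obs, float))
        if 1e-6 < d < d0:  # repulsion only inside the influence radius
            u_rep += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return -(u_att + u_rep)

def guiding_weight(episode, total_episodes):
    """One plausible GART-style schedule: the guiding-reward weight
    decays linearly from 1 to 0 over training."""
    return max(0.0, 1.0 - episode / total_episodes)

def blended_reward(episode, total_episodes, r_guide, r_task):
    """Guiding reward dominates early episodes; the composite task reward
    (distance/time to the end point) dominates late episodes."""
    w = guiding_weight(episode, total_episodes)
    return w * r_guide + (1.0 - w) * r_task
```

In this sketch, early episodes effectively imitate the RRT/APF guidance, and late episodes optimize the composite objective alone; the paper's actual transition model may differ.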
Pages: 665 - 672
Page Count: 8
Related Papers
(50 records)
  • [1] EPPE: An Efficient Progressive Policy Enhancement framework of deep reinforcement learning in path planning
    Zhao, Wang
    Zhang, Ye
    Xie, Zikang
    NEUROCOMPUTING, 2024, 596
  • [2] A UAV Path Planning Method Based on Deep Reinforcement Learning
    Li, Yibing
    Zhang, Sitong
    Ye, Fang
    Jiang, Tao
    Li, Yingsong
    2020 IEEE USNC-CNC-URSI NORTH AMERICAN RADIO SCIENCE MEETING (JOINT WITH AP-S SYMPOSIUM), 2020, : 93 - 94
  • [3] A path planning method based on deep reinforcement learning for crowd evacuation
    Meng, X.
    Liu, H.
    Li, W.
    Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (6) : 2925 - 2939
  • [4] Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
    Khojasteh, Mahdi Shahbazi
    Salimi-Badr, Armin
    IEEE OPEN JOURNAL OF VEHICULAR TECHNOLOGY, 2025, 6 : 34 - 51
  • [5] Path Planning for Autonomous Vehicles in Unknown Dynamic Environment Based on Deep Reinforcement Learning
    Hu, Hui
    Wang, Yuge
    Tong, Wenjie
    Zhao, Jiao
    Gu, Yulei
    APPLIED SCIENCES-BASEL, 2023, 13 (18):
  • [6] Adaptive Path Planning for Autonomous Ships Based on Deep Reinforcement Learning Combined with Images
    Zheng, Kangjie
    Zhang, Xinyu
    Wang, Chengbo
    Cui, Hao
    Wang, Leihao
    PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 1706 - 1715
  • [7] An Autonomous Path Planning Model for Unmanned Ships Based on Deep Reinforcement Learning
    Guo, Siyu
    Zhang, Xiuguo
    Zheng, Yisong
    Du, Yiquan
    SENSORS, 2020, 20 (02)
  • [8] Deep reinforcement learning for path planning of autonomous mobile robots in complicated environments
    Zhang, Zhijie
    Fu, Hao
    Yang, Juan
    Lin, Yunhan
    Complex & Intelligent Systems, 2025, 11 (6)
  • [9] Mobile Robot Path Planning Method Based on Deep Reinforcement Learning Algorithm
    Meng, Haitao
    Zhang, Hengrui
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2022, 31 (15)
  • [10] Robot Search Path Planning Method Based on Prioritized Deep Reinforcement Learning
    Liu, Yanglong
    Chen, Zuguo
    Li, Yonggang
    Lu, Ming
    Chen, Chaoyang
    Zhang, Xuzhuo
    International Journal of Control, Automation and Systems, 2022, 20 : 2669 - 2680