A Guided-to-Autonomous Policy Learning method of Deep Reinforcement Learning in Path Planning

Citations: 0
Authors
Zhao, Wang [1 ]
Zhang, Ye [1 ]
Li, Haoyu [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Astronaut, Xian, Peoples R China
Source
2024 IEEE 18TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION, ICCA 2024
Funding
National Natural Science Foundation of China
Keywords
path planning; Deep Reinforcement Learning; training efficiency; composite optimization; Guided-to-Autonomous Policy Learning;
DOI
10.1109/ICCA62789.2024.10591821
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This study introduces a Guided-to-Autonomous Policy Learning (GAPL) method that improves the training efficiency and composite optimization of Deep Reinforcement Learning (DRL) in path planning. Under this method, we first introduce guiding rewards as a reward-enhancement mechanism which, built on the Rapidly-exploring Random Trees (RRT) and Artificial Potential Field (APF) algorithms, effectively addresses the challenge of training efficiency. We then propose the Guided-to-Autonomous Reward Transition (GART) model to balance training efficiency with composite optimization: the reward structure, initially dominated by guiding rewards, is progressively refined toward rewards that emphasize composite optimization, specifically minimizing the distance and time to the end point. Simulated experiments in static obstacle settings and mixed dynamic-static obstacle environments demonstrate that: 1) guiding rewards play a significant role in enhancing training efficiency; 2) the GAPL method yields superior composite-optimization outcomes for path planning compared with conventional methods, and it effectively addresses the training-efficiency issue of conventional DRL methods.
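The guided-to-autonomous transition described in the abstract can be illustrated with a minimal reward-blending sketch. This is an assumption-laden illustration, not the paper's implementation: the names (`guiding_weight`, `blended_reward`), the exponential decay schedule, and the decay rate `k` are all hypothetical, since the abstract does not specify the actual transition schedule; only the idea that guiding rewards (RRT/APF-based) dominate early and composite-optimization rewards dominate later is taken from the text.

```python
import math

def guiding_weight(episode, total_episodes, k=5.0):
    """Hypothetical decay schedule: the guiding-reward weight starts
    near 1 and decays toward 0 as training progresses, so the policy
    shifts from guided to autonomous learning."""
    return math.exp(-k * episode / total_episodes)

def blended_reward(r_guide, r_task, episode, total_episodes):
    """Blend a guiding reward (e.g. derived from an RRT reference path
    or an APF potential) with the task reward (distance- and time-to-goal
    terms) according to the current transition weight."""
    w = guiding_weight(episode, total_episodes)
    return w * r_guide + (1.0 - w) * r_task

# Early in training the guiding term dominates; late in training the
# composite-optimization term dominates.
early = blended_reward(r_guide=1.0, r_task=0.0, episode=0, total_episodes=100)
late = blended_reward(r_guide=1.0, r_task=0.0, episode=100, total_episodes=100)
```

Any monotonically decreasing schedule (linear, piecewise, or performance-triggered) would realize the same guided-to-autonomous idea; the exponential form is chosen here only for brevity.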
Pages: 665-672
Page count: 8