Improved Robot Path Planning Method Based on Deep Reinforcement Learning

Cited by: 14
Authors
Han, Huiyan [1 ,2 ,3 ]
Wang, Jiaqi [1 ,2 ,3 ]
Kuang, Liqun [1 ,2 ,3 ]
Han, Xie [1 ,2 ,3 ]
Xue, Hongxin [1 ,2 ,3 ]
Affiliations
[1] North Univ China, Sch Comp Sci & Technol, Taiyuan 030051, Peoples R China
[2] Shanxi Key Lab Machine Vis & Virtual Real, Taiyuan 030051, Peoples R China
[3] Shanxi Vis Informat Proc & Intelligent Robot Engn, Taiyuan 030051, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
robot path planning; deep reinforcement learning; DDQN; expert experience;
DOI
10.3390/s23125622
Chinese Library Classification (CLC)
O65 [Analytical Chemistry];
Subject Classification Codes
070302; 081704;
Abstract
With the advancement of robotics, the field of path planning is flourishing. Researchers have achieved remarkable results on this nonlinear problem with the Deep Reinforcement Learning (DRL) algorithm DQN (Deep Q-Network). However, persistent challenges remain, including the curse of dimensionality, difficulty of model convergence, and sparse rewards. To tackle these problems, this paper proposes an enhanced DDQN (Double DQN) path planning approach, in which dimensionality-reduced state information is fed into a two-branch network that incorporates expert knowledge and an optimized reward function to guide training. The data generated during the training phase are first discretized into corresponding low-dimensional spaces. An "expert experience" module is incorporated into the epsilon-greedy exploration strategy to accelerate early-stage training. To handle navigation and obstacle avoidance separately, a dual-branch network structure is presented. We further optimize the reward function so that the agent receives prompt feedback from the environment after each action. Experiments conducted in both virtual and real-world environments demonstrate that the enhanced algorithm accelerates model convergence, improves training stability, and generates smooth, shorter, collision-free paths.
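The record does not include the authors' code, so the following is a minimal, hypothetical PyTorch sketch of the four ideas the abstract names: a dual-branch Q-network, expert-guided epsilon-greedy exploration, a dense (shaped) reward, and the Double-DQN target. All names, layer sizes, fusion scheme, and reward constants are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the abstract's ideas (illustrative names and
# hyperparameters, not the authors' code).
import random
import torch
import torch.nn as nn

N_ACTIONS = 8  # assumed: 8 discrete heading directions

class DualBranchQNet(nn.Module):
    """Shared encoder feeding two branches (navigation vs. obstacle
    avoidance); their per-action outputs are summed into one Q-value
    per action. The summation fusion is an assumption."""
    def __init__(self, state_dim: int, n_actions: int = N_ACTIONS):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.nav_branch = nn.Sequential(
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, n_actions))
        self.avoid_branch = nn.Sequential(
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, s):
        h = self.encoder(s)
        return self.nav_branch(h) + self.avoid_branch(h)

def expert_epsilon_greedy(q_net, state, epsilon, expert_action=None):
    """Epsilon-greedy in which early exploratory moves defer to an
    expert heuristic (e.g., step toward the goal) instead of pure
    uniform noise -- one plausible reading of 'expert experience'."""
    if random.random() < epsilon:
        if expert_action is not None:
            return expert_action            # expert-guided exploration
        return random.randrange(N_ACTIONS)  # fallback: uniform random
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def shaped_reward(dist_prev, dist_now, collided, reached):
    """Dense reward giving immediate feedback after every action;
    the constants are assumed values."""
    if collided:
        return -1.0
    if reached:
        return 1.0
    return 0.1 * (dist_prev - dist_now)  # reward progress toward goal

def ddqn_target(online, target, r, s_next, done, gamma=0.99):
    """Double-DQN target: the online net selects the next action and
    the target net evaluates it, reducing Q-value overestimation."""
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=-1, keepdim=True)
        q_next = target(s_next).gather(-1, a_star).squeeze(-1)
    return r + gamma * (1.0 - done) * q_next
```

In this reading, the expert action would dominate exploration while epsilon is large early in training and fade out as epsilon decays, matching the abstract's claim of faster early-stage convergence.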
Pages: 23