Design and Experimental Validation of Deep Reinforcement Learning-Based Fast Trajectory Planning and Control for Mobile Robot in Unknown Environment

Times Cited: 167
Authors
Chai, Runqi [1 ,2 ]
Niu, Hanlin [1 ]
Carrasco, Joaquin [1 ]
Arvin, Farshad [3 ]
Yin, Hujun [1 ]
Lennox, Barry [1 ]
Affiliations
[1] Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England
[2] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
[3] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Mobile robots; Trajectory; Planning; Collision avoidance; Training; Robot sensing systems; Noise measurement; Deep reinforcement learning (DRL); mobile robot; motion control; noisy prioritized experience replay (PER); optimal motion planning; recurrent neural network; unexpected obstacles; ROBUST; IMPLEMENTATION; VEHICLES; ASTERISK;
DOI
10.1109/TNNLS.2022.3209154
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
This article addresses the problem of planning optimal maneuver trajectories and guiding a mobile robot toward target positions in uncertain environments for exploration purposes. A hierarchical deep learning-based control framework is proposed, consisting of an upper-level motion planning layer and a lower-level waypoint tracking layer. In the motion planning phase, a recurrent deep neural network (RDNN)-based algorithm is adopted to predict the optimal maneuver profiles for the mobile robot. This approach builds on the recently proposed idea of using deep neural networks (DNNs) to approximate optimal motion trajectories, which has been shown to achieve fast approximation performance. To further enhance the network prediction performance, a recurrent network model that fully exploits the inherent relationship between preoptimized system state and control pairs is advocated. At the lower level, a deep reinforcement learning (DRL)-based collision-free control algorithm is established to achieve the waypoint tracking task in an uncertain environment (e.g., in the presence of unexpected obstacles). Since this approach allows the control policy to learn directly from human demonstration data, the time required for training can be significantly reduced. Moreover, a noisy prioritized experience replay (PER) algorithm is proposed to improve the exploration rate of the control policy. The effectiveness of the proposed deep learning-based control is validated through a number of simulation and experimental case studies. The simulation results show that the proposed DRL method outperforms the vanilla PER algorithm in terms of training speed. Experimental videos are also uploaded, and the corresponding results confirm that the proposed strategy is able to fulfill the autonomous exploration mission with improved motion planning performance, enhanced collision avoidance ability, and less training time.
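The abstract mentions a noisy prioritized experience replay (PER) mechanism but does not detail it. As background only, the sketch below shows standard proportional PER, which the paper's variant builds on: transitions are sampled with probability proportional to their TD-error-based priority, and importance-sampling weights correct the resulting bias. All class and parameter names here are illustrative; the authors' noisy extension (which additionally perturbs the network for exploration) is not reproduced.

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (PER) sketch.

    Transition i is sampled with probability p_i**alpha / sum_j p_j**alpha,
    where p_i derives from its TD error; importance-sampling weights
    (n * P(i))**(-beta), normalized by the batch maximum, correct the bias.
    """

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.beta = beta            # strength of importance-sampling correction
        self.data = []
        self.priorities = []
        self.pos = 0                # circular write position once full

    def add(self, transition, td_error=1.0):
        # Small epsilon keeps every transition sampleable.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        weights = [(n * probs[i]) ** (-self.beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]  # normalize for stability
        return [self.data[i] for i in idxs], idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, refresh priorities of the sampled batch.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```

In this scheme, transitions with larger TD errors are replayed more often, which is what speeds up training relative to uniform replay; the noisy variant described in the abstract targets the complementary problem of insufficient exploration.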
Pages: 5778-5792 (15 pages)