Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning

Cited by: 41

Authors:

Yu, Lingli ^{[1,2,3]}

Shao, Xuanya ^{[1]}

Wei, Yadong ^{[1]}

Zhou, Kaijun ^{[4]}

Affiliations:

[1] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China

[2] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150001, Peoples R China

[3] Chongqing Univ, State Key Lab Mech Transmiss, Chongqing 400044, Peoples R China

[4] Hunan Univ Commerce, Sch Comp & Informat Engn, Changsha 410205, Hunan, Peoples R China

Source:

SENSORS | 2018 / Vol. 18 / Issue 09

Funding:

National Natural Science Foundation of China;

Keywords:

intelligent driving vehicle; trajectory planning; end-to-end; deep reinforcement learning; model transfer; GENERATION;

DOI:

10.3390/s18092905

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

To address the problem of model error and tracking dependence in the process of intelligent vehicle motion planning, an intelligent vehicle model transfer trajectory planning method based on deep reinforcement learning is proposed, which is able to obtain an effective control action sequence directly. Firstly, an abstract model of the real environment is extracted. On this basis, a deep deterministic policy gradient (DDPG) and a vehicle dynamic model are adopted to jointly train a reinforcement learning model, and to decide the optimal intelligent driving maneuver. Secondly, the actual scene is transferred to an equivalent virtual abstract scene using a transfer model. Furthermore, the control action and trajectory sequences are calculated according to the trained deep reinforcement learning model. Thirdly, the optimal trajectory sequence is selected according to an evaluation function in the real environment. Finally, the results demonstrate that the proposed method can deal with the problem of intelligent vehicle trajectory planning for continuous input and continuous output. The model transfer method improves the model's generalization performance. Compared with traditional trajectory planning, the proposed method outputs continuous rotation-angle control sequences. Moreover, the lateral control errors are also reduced.
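The last two steps of the abstract (computing control and trajectory sequences, then selecting the best one with an evaluation function) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the kinematic bicycle model stands in for the vehicle dynamic model, `stand_in_policy` is a hand-written proportional rule standing in for the trained DDPG actor, and the cost terms and weights in `evaluate` are hypothetical.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
WHEELBASE = 2.7   # m
DT = 0.1          # s, integration step
SPEED = 5.0       # m/s, held constant
MAX_STEER = 0.5   # rad, steering limit


def step(state, delta):
    """Advance (x, y, yaw) one step with a kinematic bicycle model."""
    x, y, yaw = state
    x += SPEED * math.cos(yaw) * DT
    y += SPEED * math.sin(yaw) * DT
    yaw += SPEED / WHEELBASE * math.tan(delta) * DT
    return (x, y, yaw)


def stand_in_policy(state, gain):
    """Hypothetical stand-in for a trained actor: steer toward the line y = 0."""
    _, y, yaw = state
    delta = -gain * y - 1.5 * yaw
    return max(-MAX_STEER, min(MAX_STEER, delta))


def generate_candidate(start, gain, steps=80):
    """Roll the policy out and return (steering sequence, trajectory)."""
    state, actions, traj = start, [], [start]
    for _ in range(steps):
        delta = stand_in_policy(state, gain)
        state = step(state, delta)
        actions.append(delta)
        traj.append(state)
    return actions, traj


def evaluate(actions, traj, w_lat=1.0, w_smooth=5.0):
    """Hypothetical evaluation function: lateral error plus steering roughness."""
    lateral = sum(y * y for _, y, _ in traj) / len(traj)
    smooth = sum((b - a) ** 2 for a, b in zip(actions, actions[1:]))
    return w_lat * lateral + w_smooth * smooth


def select_best(start, gains):
    """Generate one candidate per gain and keep the lowest-cost one."""
    candidates = [generate_candidate(start, g) for g in gains]
    return min(candidates, key=lambda c: evaluate(*c))


if __name__ == "__main__":
    best_actions, best_traj = select_best((0.0, 1.0, 0.0), [0.0, 0.1, 0.3, 0.6])
    print("final lateral offset:", best_traj[-1][1])
```

The candidates differ only in the policy gain, so the selection step is the part that corresponds to the abstract; in the method described, the candidates would instead come from the trained deep reinforcement learning model applied to the transferred scene.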


Pages: 22
