Receding-Horizon Reinforcement Learning Approach for Kinodynamic Motion Planning of Autonomous Vehicles

Cited by: 31
Authors
Zhang, Xinglong [1 ]
Jiang, Yan [1 ]
Lu, Yang [1 ]
Xu, Xin [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China
Source
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES | 2022, Vol. 7, Issue 3
Funding
National Key R&D Program of China; National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Kinodynamic planning; receding horizon; reinforcement learning; autonomous vehicles; MODEL-PREDICTIVE CONTROL; OPTIMIZATION; NAVIGATION;
DOI
10.1109/TIV.2022.3167271
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Kinodynamic motion planning is critical for autonomous vehicles with high maneuverability in dynamic environments. However, obtaining near-optimal motion planning solutions at low computational cost and with inaccurate prior model information is challenging. To address this issue, this paper proposes a receding-horizon reinforcement learning approach for kinodynamic motion planning (RHRL-KDP) of autonomous vehicles in the presence of inaccurate dynamics information and moving obstacles. Specifically, a receding-horizon actor-critic reinforcement learning algorithm is presented, yielding a neural-network-based planning strategy that can be learned both offline and online. A neural-network model is built and learned online to approximate the modeling uncertainty of the prior nominal model, in order to improve planning performance. Furthermore, active collision avoidance in dynamic environments is realized by constructing safety-related terms in the actor and critic networks using potential fields. In theory, the uniform ultimate boundedness of the modeling uncertainty's approximation error is proven, and the convergence of the proposed RHRL-KDP is analyzed. Simulation tests show that our approach outperforms previously developed motion planners based on model predictive control (MPC), safe RL, and RRT* in terms of planning performance. Furthermore, in both online and offline learning scenarios, RHRL-KDP outperforms MPC and RRT* in computational efficiency.
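The core idea sketched in the abstract — plan over a receding horizon, learn an actor (policy) and a critic (value) from rollouts, and bake obstacle avoidance into the cost via a potential field — can be illustrated with a toy example. The sketch below is NOT the paper's method: the 2-D point-mass model, the linear feature parameterization, the finite-difference actor update, and all gains and names are illustrative assumptions chosen for brevity.

```python
import numpy as np

# Minimal, assumed sketch: receding-horizon actor-critic planning for a
# 2-D point mass, with a repulsive potential-field term in the stage cost
# acting as the safety-related component.

np.random.seed(0)

DT, HORIZON = 0.1, 10                     # step size, receding horizon N
GOAL = np.array([5.0, 5.0])
OBSTACLE = np.array([2.0, 2.8])           # a single static obstacle

def dynamics(x, u):
    """Nominal model: state = position, input = commanded velocity."""
    return x + DT * u

def potential_field(x, radius=1.0, gain=5.0):
    """Repulsive potential; nonzero only within `radius` of the obstacle."""
    d = np.linalg.norm(x - OBSTACLE)
    return gain * max(0.0, radius - d) ** 2

def stage_cost(x, u):
    return np.sum((x - GOAL) ** 2) + 0.1 * np.sum(u ** 2) + potential_field(x)

def features(x):
    """Shared linear features for the actor and critic."""
    return np.array([x[0], x[1], x[0] - GOAL[0], x[1] - GOAL[1], 1.0])

W_actor = np.zeros((2, 5))                # policy: u = W_actor @ features(x)
w_critic = np.zeros(5)                    # value:  V(x) = w_critic @ features(x)

def horizon_cost(x0, W):
    """N-step rollout cost under the actor, plus the critic's terminal value."""
    x, J = x0.copy(), 0.0
    for _ in range(HORIZON):
        u = W @ features(x)
        J += stage_cost(x, u)
        x = dynamics(x, u)
    return J + w_critic @ features(x)

x = np.array([0.0, 0.0])
for _ in range(100):
    J = horizon_cost(x, W_actor)

    # Actor update: finite-difference gradient of the horizon cost.
    eps, grad = 1e-4, np.zeros_like(W_actor)
    for i in range(W_actor.shape[0]):
        for j in range(W_actor.shape[1]):
            Wp = W_actor.copy()
            Wp[i, j] += eps
            grad[i, j] = (horizon_cost(x, Wp) - J) / eps
    W_actor -= 1e-4 * grad

    # Critic update: regress V(x) toward the rolled-out horizon cost.
    td_err = J - w_critic @ features(x)
    w_critic += 1e-4 * td_err * features(x)

    # Receding-horizon step: apply only the first control, then re-plan.
    x = dynamics(x, W_actor @ features(x))

print(np.round(x, 2))   # state after 100 plan-and-step iterations
```

As in the paper's setting, only the first control of each planned horizon is applied before re-planning, and learning continues online as the vehicle moves; the paper additionally learns a neural-network correction to the nominal model, which this toy omits.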
Pages: 556-568
Page count: 13