Kinodynamic Motion Planning With Continuous-Time Q-Learning: An Online, Model-Free, and Safe Navigation Framework

Cited by: 63
Authors
Kontoudis, George P. [1 ]
Vamvoudakis, Kyriakos G. [2 ]
Affiliations
[1] Virginia Tech, Kevin T Crofton Dept Aerosp & Ocean Engn, Blacksburg, VA 24060 USA
[2] Georgia Tech, Daniel Guggenheim Sch Aerosp Engn, Atlanta, GA 30332 USA
Keywords
Planning; Heuristic algorithms; Optimal control; Dynamics; System dynamics; Computational modeling; Navigation; Actor-critic network; asymptotic optimality; online motion planning; Q-learning; LINEAR-SYSTEMS; DESIGN
DOI
10.1109/TNNLS.2019.2899311
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents an online kinodynamic motion planning algorithmic framework that combines the asymptotically optimal rapidly exploring random tree (RRT*) with continuous-time Q-learning, which we term RRT-Q*. We formulate a model-free Q-based advantage function and use integral reinforcement learning to develop tuning laws for the online approximation of the optimal cost and the optimal policy of continuous-time linear systems. Moreover, we provide rigorous Lyapunov-based proofs of the stability of the equilibrium point, which yield asymptotic convergence properties. A terminal-state evaluation procedure is introduced to facilitate the online implementation. We propose a static obstacle augmentation and a local replanning framework, both based on topological connectedness, to locally recompute the robot's path and ensure collision-free navigation. We perform simulations and a qualitative comparison to evaluate the efficacy of the proposed methodology.
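The abstract couples a sampling-based global planner with a learned local controller. As a hedged illustration of the sampling-based side only, the sketch below is a minimal geometric RRT in an obstacle-free 2-D workspace — it is not the paper's kinodynamic RRT* or its Q-learning controller, and the workspace bounds, step size, goal bias, and `is_free` collision predicate are illustrative assumptions.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, max_iters=2000, seed=0):
    """Grow a tree from `start` toward `goal` in a 10x10 2-D workspace.

    Returns a list of waypoints from start to a node within `goal_tol`
    of the goal, or None if the iteration budget is exhausted.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample a random point, biased toward the goal 10% of the time.
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10), rng.uniform(0, 10))
        # Steer a bounded step from the nearest tree node toward the sample.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        new = (nx + step * (sample[0] - nx) / d, ny + step * (sample[1] - ny) / d)
        if not is_free(new):
            continue  # discard samples that land in an obstacle
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) < goal_tol:
            # Reconstruct the path by walking parent pointers to the root.
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None
```

A kinodynamic variant would replace the straight-line steering step with a forward simulation of the system dynamics, which is where the paper's model-free Q-learning controller enters.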
Pages: 3803-3817
Page count: 15