Full-range adaptive cruise control based on supervised adaptive dynamic programming

Cited by: 73
Authors
Zhao, Dongbin [1]
Hu, Zhaohui [1, 2]
Xia, Zhongpu [1]
Alippi, Cesare [1, 3]
Zhu, Yuanheng [1]
Wang, Ding [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Guangdong Power Grid Corp, Elect Power Res Inst, Guangzhou 510080, Guangdong, Peoples R China
[3] Politecn Milan, Dipartimento Elettron & Informaz, I-20133 Milan, Italy
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Adaptive dynamic programming; Supervised reinforcement learning; Neural networks; Adaptive cruise control; Stop and go; SYSTEMS;
DOI
10.1016/j.neucom.2012.09.034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper proposes a supervised adaptive dynamic programming (SADP) algorithm for a full-range adaptive cruise control (ACC) system, which can be formulated as a dynamic programming problem with stochastic demands. The suggested ACC system is designed to let the host vehicle drive both on highways and in Stop and Go (SG) urban scenarios. In both operational cases, the ACC system can autonomously drive the host vehicle to a desired speed and/or a given distance from the target vehicle. Traditional adaptive dynamic programming (ADP) is a suitable tool for this problem, but its training usually suffers from low convergence rates and rarely yields an effective controller. An SADP algorithm that introduces the concept of an inducing region is presented here to overcome these training drawbacks. The SADP algorithm performs very well in all simulation scenarios and always better than more traditional controllers. The conclusion is that the proposed SADP algorithm is an effective control methodology for the full-range ACC problem. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 57-67
Number of pages: 11