Multi-task deep learning with optical flow features for self-driving cars

Cited by: 3
Authors
Hu, Yuan [1 ]
Shum, Hubert P. H. [2 ]
Ho, Edmond S. L. [3 ]
Affiliations
[1] Henan Agricultural University, Department of Mechanical and Electrical Engineering, 63 Agr Ave, Zhengzhou, People's Republic of China
[2] Durham University, Department of Computer Science, Durham DH1 3LE, England
[3] Northumbria University, Department of Computer and Information Sciences, Newcastle upon Tyne NE1 8ST, Tyne & Wear, England
Keywords
traffic engineering computing; image sequences; video signal processing; feature extraction; learning (artificial intelligence); image motion analysis; automobiles; cameras; monocular dash camera; vehicle control; consecutive images; control signal; motion-based feature; flow predictor; self-supervised deep network; supervised multitask deep network; optical flow features; dash camera video; multitask deep learning; self-driving cars;
DOI
10.1049/iet-its.2020.0439
Chinese Library Classification
TM (Electrical Technology); TN (Electronic Technology, Communication Technology)
Discipline Classification Code
0808; 0809
Abstract
The control of self-driving cars has received growing attention recently. Although existing research shows promising results in vehicle control using video from a monocular dash camera, there has been very limited work on directly learning vehicle control from motion-based cues. Such cues are powerful features for visual representations, as they encode the per-pixel movement between two consecutive images, allowing a system to effectively map the features into the control signal. The authors propose a new framework that exploits a motion-based feature known as optical flow, extracted from the dash camera, and demonstrate that such a feature significantly improves the accuracy of the control signals. The proposed framework involves two main components. The flow predictor, a self-supervised deep network, models the underlying scene structure from consecutive frames and generates the optical flow. The controller, a supervised multi-task deep network, predicts both the steering angle and the speed. The authors demonstrate that the proposed framework using the optical flow features can effectively predict control signals from a dash camera video. Using the Cityscapes dataset, the authors validate that the system prediction has errors as low as 0.0130 rad/s on steering angle and 0.0615 m/s on speed, outperforming existing research.
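The record does not include the authors' code; the following is a minimal sketch, assuming a PyTorch implementation, of the second component described in the abstract: a multi-task controller that maps a dense optical-flow field to a steering-angle and a speed prediction through a shared encoder and two task-specific heads. All names (FlowController, steer_head, speed_head), layer sizes, and the loss weighting are hypothetical illustrations, not the architecture reported in the paper.

```python
# Hypothetical sketch (not the authors' code): a multi-task controller that maps
# a dense optical-flow field (2 channels: horizontal/vertical per-pixel motion)
# to two control signals, steering angle and speed, via a shared convolutional
# encoder and two task-specific regression heads.
import torch
import torch.nn as nn

class FlowController(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder over the 2-channel optical-flow field.
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Task-specific heads: one scalar output each for steering angle and speed.
        self.steer_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
        self.speed_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, flow):
        # flow: (batch, 2, H, W) optical flow between two consecutive frames.
        feat = self.encoder(flow)
        return self.steer_head(feat), self.speed_head(feat)

if __name__ == "__main__":
    model = FlowController()
    flow = torch.randn(4, 2, 256, 512)   # dummy flow fields for a batch of 4 frame pairs
    steer_pred, speed_pred = model(flow)
    steer_gt, speed_gt = torch.randn(4, 1), torch.randn(4, 1)
    # Multi-task objective: a sum of per-task regression losses
    # (equal weighting is an assumption, not taken from the paper).
    loss = nn.functional.mse_loss(steer_pred, steer_gt) \
         + nn.functional.mse_loss(speed_pred, speed_gt)
    loss.backward()
    print(steer_pred.shape, speed_pred.shape, float(loss))
```

In such a setup the optical flow supplied to the controller would come from the first component, the self-supervised flow predictor, rather than from ground-truth flow; the sketch above only illustrates the multi-task regression stage.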
Pages: 1845-1854
Page count: 10
References (37 in total)
[11] Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: 'The Cityscapes Dataset for Semantic Urban Scene Understanding'. Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 3213-3223
[12] Eitel, A.: Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2015, p. 681, DOI 10.1109/IROS.2015.7353446
[13] Geiger, A.: Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2012, p. 3354, DOI 10.1109/CVPR.2012.6248074
[14] Hemachandra, S.: Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2015, p. 5608, DOI 10.1109/ICRA.2015.7139984
[15] Hou, Y.: Proc. AAAI Conf. on Artificial Intelligence, New Orleans, 2018, p. 8433
[16] Kashyap, H.: arXiv preprint arXiv:1903.03731, 2019
[17] Kim, J., Rohrbach, A., Darrell, T., Canny, J., Akata, Z.: 'Textual Explanations for Self-Driving Vehicles'. Computer Vision - ECCV 2018, Part II, LNCS vol. 11206, 2018, pp. 577-593
[18] Kim, J., Canny, J.: 'Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention'. Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2017, pp. 2961-2969
[19] Kong, X., Chen, Q., Gu, G., Ren, K., Qian, W., Liu, Z.: 'Particle filter-based vehicle tracking via HOG features after image stabilisation in intelligent drive system'. IET Intelligent Transport Systems, 2019, 13, (6), pp. 942-949
[20] Kulkarni, T.D.: Advances in Neural Information Processing Systems, 2016, p. 3682