Multi-task deep learning with optical flow features for self-driving cars

Cited by: 3
Authors
Hu, Yuan [1]
Shum, Hubert P. H. [2]
Ho, Edmond S. L. [3]
Affiliations
[1] Henan Agricultural University, Department of Mechanical and Electrical Engineering, 63 Agricultural Avenue, Zhengzhou, People's Republic of China
[2] Durham University, Department of Computer Science, Durham DH1 3LE, England
[3] Northumbria University, Department of Computer and Information Sciences, Newcastle upon Tyne NE1 8ST, Tyne & Wear, England
Keywords
traffic engineering computing; image sequences; video signal processing; feature extraction; learning (artificial intelligence); image motion analysis; automobiles; cameras; monocular dash camera; vehicle control; consecutive images; control signal; motion-based feature; flow predictor; self-supervised deep network; supervised multitask deep network; optical flow features; dash camera video; multitask deep learning; self-driving cars
DOI
10.1049/iet-its.2020.0439
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline codes
0808; 0809
Abstract
The control of self-driving cars has received growing attention recently. Although existing research shows promising results for vehicle control using video from a monocular dash camera, there has been very limited work on learning vehicle control directly from motion-based cues. Such cues are powerful visual features, as they encode the per-pixel movement between two consecutive images, allowing a system to map them effectively into control signals. The authors propose a new framework that exploits a motion-based feature, optical flow, extracted from the dash camera, and demonstrate that this feature significantly improves the accuracy of the predicted control signals. The proposed framework has two main components. The flow predictor, a self-supervised deep network, models the underlying scene structure from consecutive frames and generates the optical flow. The controller, a supervised multi-task deep network, predicts both steering angle and speed. The authors demonstrate that the proposed framework, using the optical flow features, can effectively predict control signals from dash-camera video. On the Cityscapes data set, the system achieves errors as low as 0.0130 rad/s on steering angle and 0.0615 m/s on speed, outperforming existing research.
Pages: 1845-1854
Number of pages: 10
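
The abstract describes a two-stage pipeline: a self-supervised flow predictor that turns two consecutive dash-camera frames into a dense optical-flow field, and a supervised multi-task controller that maps the flow to steering and speed outputs. Below is a minimal sketch of that idea, assuming PyTorch; the class names (FlowPredictor, MultiTaskController), layer sizes, and tensor shapes are illustrative assumptions, not the authors' published architecture or training objective.

# Minimal sketch of the two-stage framework described in the abstract (assumed PyTorch).
# All class names, layer sizes and shapes are illustrative placeholders.
import torch
import torch.nn as nn


class FlowPredictor(nn.Module):
    """Toy encoder-decoder mapping two consecutive RGB frames (6 channels)
    to a dense 2-channel optical-flow field of the same spatial size."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 2, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, frame_pair):
        # frame_pair: (B, 6, H, W) -- two RGB frames stacked along the channel axis
        return self.decoder(self.encoder(frame_pair))


class MultiTaskController(nn.Module):
    """Toy multi-task controller mapping a flow field to two scalars:
    a steering signal and a speed signal."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(2, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.steer_head = nn.Linear(64, 1)  # steering-angle task
        self.speed_head = nn.Linear(64, 1)  # speed task

    def forward(self, flow):
        features = self.backbone(flow)
        return self.steer_head(features), self.speed_head(features)


if __name__ == "__main__":
    frames = torch.randn(1, 6, 128, 256)        # one pair of consecutive frames
    flow = FlowPredictor()(frames)              # (1, 2, 128, 256) flow field
    steer, speed = MultiTaskController()(flow)  # two scalar control outputs
    print(steer.shape, speed.shape)             # torch.Size([1, 1]) for each head

In this toy layout the steering and speed heads share one convolutional backbone over the flow field, mirroring the multi-task framing in the abstract; the actual network design and the self-supervised objective used to train the flow predictor are detailed in the paper.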