Fast Deep Neural Networks With Knowledge Guided Training and Predicted Regions of Interests for Real-Time Video Object Detection

Cited by: 46
Authors
Cao, Wenming [1 ,2 ]
Yuan, Jianhe [1 ]
He, Zhihai [2 ]
Zhang, Zhi [2 ]
He, Zhiquan [1 ]
Affiliations
[1] Shenzhen Univ, Shenzhen Key Lab Media Secur, Shenzhen 518060, Peoples R China
[2] Univ Missouri, Dept Elect & Comp Engn, Video Proc & Commun Lab, Columbia, MO 65211 USA
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
Assisted driving; deep neural networks; knowledge projection; speed optimization; vehicle detection; classification; architecture; system
DOI
10.1109/ACCESS.2018.2795798
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Deeper and wider neural networks continue to advance the state of the art in many computer vision and machine learning tasks. However, they often require large sets of labeled data for effective training and suffer from extremely high computational complexity, which prevents their deployment in real-time systems such as vehicle detection from in-vehicle cameras for assisted driving. In this paper, we aim to develop a fast deep neural network for real-time video object detection by exploring two ideas: knowledge-guided training and predicted regions of interest. Specifically, we develop a new framework for training deep neural networks on datasets with limited labeled samples using cross-network knowledge projection, which improves network performance while significantly reducing overall computational complexity. A large pre-trained teacher network observes samples from the training data. A projection matrix is learned to project this teacher-level knowledge and its visual representations from an intermediate layer of the teacher network to an intermediate layer of a thinner and faster student network, where it guides and regularizes the training process. To further speed up the network, we propose to train a low-complexity object detector using traditional machine learning methods, such as a support vector machine. Using this low-complexity detector, we identify the regions of interest that contain the target objects with high confidence. We derive a mathematical formula that estimates the region of interest at each convolution layer, so that computation outside it can be skipped. Our experimental results on vehicle detection from videos demonstrate that the proposed method speeds up the network by up to 16 times while maintaining object detection performance.
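The idea of estimating a region of interest at each convolution layer can be illustrated with a short sketch. This record does not reproduce the paper's formula, so the code below is an assumption: it uses the standard convolution output-size arithmetic to find which output positions have a receptive field overlapping an input ROI. The function names (`conv_out_size`, `project_roi`, `trace_roi`) and the layer settings are hypothetical and chosen only for illustration.

```python
def conv_out_size(n, kernel, stride, padding):
    """Output length of a convolution along one axis (standard formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def project_interval(lo, hi, n, kernel, stride, padding):
    """Map an inclusive input interval [lo, hi] to the inclusive range of
    output indices whose receptive field overlaps it.

    Output index o reads inputs o*stride - padding .. o*stride - padding + kernel - 1.
    """
    out = conv_out_size(n, kernel, stride, padding)
    # Ceiling division written with floor division on the negated numerator.
    first = max(0, -(-(lo + padding - kernel + 1) // stride))
    last = min(out - 1, (hi + padding) // stride)
    return first, last


def project_roi(roi, size, kernel, stride, padding):
    """Project an ROI (x0, y0, x1, y1), inclusive, through one conv layer.

    `size` is the (width, height) of the layer input.
    """
    x0, y0, x1, y1 = roi
    width, height = size
    ox0, ox1 = project_interval(x0, x1, width, kernel, stride, padding)
    oy0, oy1 = project_interval(y0, y1, height, kernel, stride, padding)
    return (ox0, oy0, ox1, oy1)


def trace_roi(roi, size, layers):
    """Follow an ROI through a stack of conv layers.

    `layers` is a list of (kernel, stride, padding) tuples (hypothetical
    settings, not the paper's architecture). Returns one entry per layer:
    (roi, output size, fraction of output positions inside the ROI).
    """
    results = []
    for kernel, stride, padding in layers:
        roi = project_roi(roi, size, kernel, stride, padding)
        size = (
            conv_out_size(size[0], kernel, stride, padding),
            conv_out_size(size[1], kernel, stride, padding),
        )
        x0, y0, x1, y1 = roi
        fraction = ((x1 - x0 + 1) * (y1 - y0 + 1)) / (size[0] * size[1])
        results.append((roi, size, fraction))
    return results


if __name__ == "__main__":
    # A 64x64 ROI inside a 224x224 frame, passed through two 3x3 layers.
    for roi, size, fraction in trace_roi(
        (64, 64, 127, 127), (224, 224), [(3, 1, 1), (3, 2, 1)]
    ):
        print(roi, size, round(fraction, 3))
```

In this sketch the ROI grows slightly at each layer because every output position depends on a neighborhood of inputs. The fraction reported per layer is a rough indicator of how much convolution work remains if positions outside the ROI are skipped.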
Pages: 8990-8999
Page count: 10