Very Fast Semantic Image Segmentation Using Hierarchical Dilation and Feature Refining

Cited by: 0
Authors
Qingqun Ning
Jianke Zhu
Chun Chen
Affiliations
[1] Zhejiang University, College of Computer Science
[2] Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies
Source
Cognitive Computation | 2018 / Vol. 10
Keywords
Semantic image segmentation; Real-time system; Convolution neural network; Receptive field; Coarse-to-fine;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
With the rapid development of deep learning techniques, semantic image segmentation, which is viewed as a key problem of scene understanding in computer vision, has improved considerably in recent years. These advances build on the capability of complex deep neural network architectures. In this paper, we present a novel deep neural network architecture designed for semantic image segmentation. To improve segmentation accuracy, we introduce a novel hierarchical dilation block that effectively enlarges the receptive field and enables multi-scale processing in a fully convolutional neural network. Moreover, we exploit the bypass technique and intermediate supervision to capture context information while upsampling and refining coarse features. We have conducted extensive experiments on several popular semantic segmentation benchmarks, including the Cityscapes, CamVid, KITTI, and Helen facial datasets. The experimental results demonstrate that our proposed approach runs two times faster than the state-of-the-art method. Our full system achieves real-time inference on 1080p images using a PC with a single GPU. In our experiments, it executes a network forward pass at 200 fps while retaining high accuracy. Our proposed approach not only runs faster than existing real-time methods but also performs on par with them.
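As a rough illustration of why dilation enlarges the receptive field, the sketch below compares a stack of ordinary 3x3 convolutions with a stack whose dilation rates grow layer by layer. It is not the paper's hierarchical dilation block: the function name `receptive_field` and the dilation rates (1, 2, 4) are assumptions chosen for illustration, and all layers are assumed to have stride 1.

```python
def receptive_field(layers):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens the
    receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Three plain 3x3 convolutions (dilation 1 everywhere).
plain = receptive_field([(3, 1), (3, 1), (3, 1)])

# Three 3x3 convolutions with assumed dilation rates 1, 2, 4.
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])

print(plain, dilated)  # prints "7 15"
```

With the same number of layers and parameters, the dilated stack covers roughly twice the spatial extent, which is the general effect the abstract attributes to dilation.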
Pages: 62-72
Page count: 10