Very Fast Semantic Image Segmentation Using Hierarchical Dilation and Feature Refining

Cited by: 0

Authors:

Qingqun Ning

Jianke Zhu

Chun Chen

Affiliations:

[1] College of Computer Science, Zhejiang University

[2] Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies

Source:

Cognitive Computation | 2018 / Vol. 10

Keywords:

Semantic image segmentation; Real-time system; Convolution neural network; Receptive field; Coarse-to-fine;

Abstract:

Driven by rapid progress in deep learning, semantic image segmentation, which is widely viewed as a key problem of scene understanding in computer vision, has improved considerably in recent years. These advances build on the capacity of complex deep neural network architectures. In this paper, we present a novel deep neural network architecture designed for semantic image segmentation. To improve segmentation accuracy, we introduce a novel hierarchical dilation block that effectively enlarges the receptive field and enables multi-scale processing in a fully convolutional neural network. Moreover, we exploit bypass and intermediate supervision to capture context information while upsampling and refining coarse features. We have conducted extensive experiments on several popular semantic segmentation benchmarks, including the Cityscapes, CamVid, KITTI, and Helen facial datasets. The experimental results demonstrate that our proposed approach runs two times faster than the state-of-the-art method. Our full system achieves real-time inference on 1080p images using a PC with a single GPU, executing a network forward pass at 200 fps in our experiments while retaining high accuracy. Our proposed approach not only runs faster than existing real-time methods but also performs on par with them.
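As a rough illustration of why dilation enlarges the receptive field, the sketch below applies the standard receptive-field formula for a stack of stride-1 convolutions. It is not the paper's hierarchical dilation block: the function name `receptive_field`, the 3x3 kernel size, and the dilation rates are assumptions chosen only for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one spatial axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field
    by (k - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Illustrative settings only (not taken from the paper).
plain = receptive_field(3, [1, 1, 1, 1])    # four ordinary 3x3 layers
dilated = receptive_field(3, [1, 2, 4, 8])  # same depth, growing dilation

print(plain, dilated)  # 9 31
```

With the same number of layers and parameters, the dilated stack covers a 31-pixel window per axis instead of 9, which is the general effect the abstract refers to.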

Pages: 62-72

Page count: 10
