Efficient Semantic Segmentation Using Multi-Path Decoder

Cited by: 3
Authors
Bai, Xing [1 ,2 ]
Zhou, Jun [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 18
Funding
National Natural Science Foundation of China;
Keywords
deep learning; semantic image segmentation; convolutional neural network;
DOI
10.3390/app10186386
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Thanks to the rapid progress of deep learning, state-of-the-art models have improved considerably. However, they are large in terms of parameters and floating-point operations, which makes them hard to deploy in real-time applications. In this paper, we propose a novel deep neural network architecture, named MPDNet, for fast and efficient semantic segmentation under resource constraints. First, we use a lightweight classification model pretrained on ImageNet as the encoder. Second, we use a cost-effective upsampling datapath to restore prediction resolution and convert features for classification into features for segmentation. Finally, we propose a multi-path decoder to extract different types of features that are not well suited to being processed inside a single convolutional neural network. In our experiments, the model outperforms other models aimed at real-time semantic segmentation on Cityscapes. With the proposed MPDNet, we achieve 76.7% mean IoU on the Cityscapes test set with only 118.84 GFLOPs, running at 37.6 Hz on 768 × 1536 images on a standard GPU.
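For readers unfamiliar with the mean IoU metric quoted in the abstract, the following is a minimal sketch of how it is commonly computed from a confusion matrix. It is not taken from the paper: the function name `mean_iou`, the flat label lists, and the `ignore_label=255` default are assumptions made for illustration, and the official Cityscapes evaluation accumulates counts over the whole test set rather than over a single toy input.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union across classes (illustrative sketch).

    pred, target: flat sequences of integer class labels, one per pixel.
    ignore_label: ground-truth value to skip (assumed convention, not from the paper).
    """
    # confusion[t][p] counts pixels with ground truth t that were predicted as p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                        # true positives
        fn = sum(confusion[c]) - tp                                 # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # wrongly labeled as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.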
Pages: 10