Sum-fusion and Cascaded interpolation for Semantic Image Segmentation

Cited by: 8

Authors:

Wang, Yan [1]

Hu, Jiani [1]

Deng, Weihong [1]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing 100876, People's Republic of China

Source:

PROCEEDINGS 2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/ACPR.2017.75

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semantic image segmentation assigns every pixel in an image to a category, but it is difficult for a single model to extract good features for every category. Because the features in a given model may excel at classifying a specific class, combining different models may yield better results, but doing so requires heavy parameter tuning. As a compromise, we propose combining several convolutional layers with different kernel sizes to capture more detailed information. Our algorithm preserves the original structure of the fully convolutional network but replaces the convolution layer after the last pooling layer with four convolution layers of different kernel sizes to extract multi-scale information. The four resulting sets of feature maps are fused into one set by element-wise summation, followed by a convolution operation. We also propose cascaded interpolation for deconvolution, producing score maps as large as the corresponding input image. We evaluate our algorithm on the SIFTFLOW dataset and find that it improves segmentation accuracy.
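The two ideas in the abstract, sum-fusion of parallel convolutions and cascaded interpolation, can be illustrated with a minimal pure-Python sketch on a single-channel feature map. This is not the authors' implementation. The kernel sizes (1, 3, 5, 7), the averaging kernels standing in for learned weights, the identity kernel standing in for the convolution after fusion, and the choice of repeated 2x bilinear steps for the cascade are all assumptions made for illustration; the abstract does not specify them.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' 2-D correlation with a square, odd-sized kernel."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def box_kernel(size):
    """Averaging kernel; a hypothetical stand-in for learned weights."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def sum_fuse(feature_maps):
    """Element-wise sum of equally sized feature maps."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(fm[y][x] for fm in feature_maps) for x in range(w)]
            for y in range(h)]


def multi_scale_sum_fusion(feature, kernel_sizes=(1, 3, 5, 7), post_kernel=None):
    """Run parallel convolutions of different kernel sizes, sum them, then convolve.

    kernel_sizes and post_kernel are assumed values, not taken from the paper.
    """
    if post_kernel is None:
        post_kernel = [[1.0]]  # identity placeholder for the post-fusion convolution
    branches = [conv2d_same(feature, box_kernel(k)) for k in kernel_sizes]
    return conv2d_same(sum_fuse(branches), post_kernel)


def resize_bilinear(image, out_h, out_w):
    """Bilinear resize that maps the corner pixels onto each other."""
    in_h, in_w = len(image), len(image[0])
    sy = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    sx = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for y in range(out_h):
        fy = y * sy
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = x * sx
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def cascaded_upsample(score_map, steps):
    """Upsample in several 2x stages instead of one large interpolation."""
    for _ in range(steps):
        score_map = resize_bilinear(
            score_map, 2 * len(score_map), 2 * len(score_map[0]))
    return score_map
```

For example, `cascaded_upsample(multi_scale_sum_fusion(feature), 3)` would fuse a coarse feature map and enlarge it eightfold in three stages. In a real network each branch would carry learned multi-channel weights, and the number of stages would be chosen to match the input resolution.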


Pages: 712-717

Page count: 6
