A Strip Dilated Convolutional Network for Semantic Segmentation

Cited by: 7
Authors
Zhou, Yan [1]
Zheng, Xihong [1]
Ouyang, Wanli [2]
Li, Baopu [3]
Affiliations
[1] Xiangtan Univ, Sch Automat & Elect Informat, Xiangtan 411105, Peoples R China
[2] Univ Sydney, Sch Elect & Informat, Camperdown, NSW 2006, Australia
[3] Baidu Res USA, Sunnyvale, CA 94089 USA
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Multi-scale contexts; Encoder-decoder; Multi-scale strip pooling module; Strip dilated convolution module; ATTENTION;
DOI
10.1007/s11063-022-11048-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Segmentation scenes frequently contain a large number of strip-shaped objects, and conventional square convolution may capture redundant information when applied to them. Building on our previously proposed SA-FFNet (Zhou et al. in Neurocomputing 453:50-59, 2021), we study the effect of strip sub-region information extraction on semantic segmentation and propose a strip dilated convolutional network. The method is suited to extracting the multi-scale strip objects that often appear in segmentation scenes, and it uses strip dilated convolution to further extract contextual dependencies in other directions. First, we propose a multi-scale strip pooling module that enables the backbone network to obtain multi-scale contexts effectively. Then, we introduce a strip dilated convolution module, which supplements the vertical contexts of the strip pooling by using strip dilated convolution. Finally, we construct a novel network that integrates the two proposed modules. The method explicitly takes the horizontal and vertical contexts of multi-scale strip objects into consideration, so that scene understanding can benefit from long-range dependencies. Experimental results on the widely used PASCAL VOC 2012 and Cityscapes scene analysis benchmark datasets show that the method outperforms existing approaches such as OCRNet, DeepLabV3+, and SPNet, both qualitatively and quantitatively.
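The two operations named in the abstract can be illustrated on a single-channel feature map. The following is only a minimal sketch under stated assumptions, not the authors' modules: the abstract does not specify scales, channel counts, fusion, or learned weights, so the function names `strip_pool` and `strip_dilated_conv`, the fixed kernel, and the zero padding are all hypothetical choices made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def strip_pool(x: Matrix) -> Matrix:
    """Average along each row and each column, then broadcast the two
    strip summaries back to H x W and add them (illustrative fusion)."""
    h, w = len(x), len(x[0])
    row_mean = [sum(row) / w for row in x]  # one value per horizontal strip
    col_mean = [sum(x[i][j] for i in range(h)) / h for j in range(w)]  # per vertical strip
    return [[row_mean[i] + col_mean[j] for j in range(w)] for i in range(h)]


def strip_dilated_conv(x: Matrix, kernel: List[float], dilation: int,
                       vertical: bool = True) -> Matrix:
    """Apply a 1-D (strip) kernel along one axis, with taps spaced
    `dilation` pixels apart and zero padding at the borders."""
    h, w = len(x), len(x[0])
    center = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for t, weight in enumerate(kernel):
                offset = (t - center) * dilation
                ii, jj = (i + offset, j) if vertical else (i, j + offset)
                if 0 <= ii < h and 0 <= jj < w:  # taps outside the map contribute zero
                    acc += weight * x[ii][jj]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    feature_map = [[1.0, 2.0], [3.0, 4.0]]
    print(strip_pool(feature_map))
    column = [[1.0], [2.0], [3.0], [4.0], [5.0]]
    print(strip_dilated_conv(column, [1.0, 1.0, 1.0], dilation=2))
```

With a dilation of 2, a three-tap vertical kernel spans five rows while still using only three weights, which is the sense in which a strip dilated convolution can gather long-range context along one direction at low cost.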
Pages: 4439-4459
Page count: 21