Semantic segmentation network with multi-path structure, attention reweighting and multi-scale encoding

Cited by: 14
Authors
Lin, Zhongkang [1, 2]
Sun, Wei [1, 2]
Tang, Bo [1, 2]
Li, Jinda [1, 2]
Yao, Xinyuan [1, 2]
Li, Yu [1, 2]
Affiliations
[1] Wuhan Univ Sci & Technol, Key Lab Met Equipment & Control Technol, Wuhan 430081, Peoples R China
[2] Wuhan Univ Sci & Technol, Engn Res Ctr Met Automat & Measurement Technol, Wuhan 430081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Multi-scale information; Attention mechanism; Multi-path networks;
DOI
10.1007/s00371-021-02360-7
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Semantic segmentation is an active field of computer vision that provides semantic information for many applications. In semantic segmentation tasks, spatial information, context information, and high-level semantic information all play an important role in improving segmentation accuracy. This paper proposes a semantic segmentation network with a multi-path structure, attention reweighting, and a multi-scale encoding structure. First, three parallel paths are designed: a pyramid spatial path that takes a pyramid image input, a context path built on a lightweight backbone network, and a semantic graph path composed of spatial graph convolutional layers. Second, a feature fusion module performs a weighted fusion of the output features of the different paths based on a channel attention mechanism. The network is then trained on the CamVid and Cityscapes semantic segmentation datasets. Finally, ablation experiments verify the effectiveness of the proposed network components and analyze the computational efficiency and segmentation accuracy of the model. The experimental results show that, by combining multi-scale information, high-level semantic information, and global context information, the network improves segmentation accuracy while maintaining high computational efficiency.
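The channel-attention-based weighted fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' module: it assumes a generic squeeze-and-excitation style reweighting (global average pooling, one hidden layer with ReLU, sigmoid gates), and the function name `channel_attention_fusion`, the weight matrices `w1` and `w2`, and all shapes are hypothetical choices made for illustration only.

```python
import math

def channel_attention_fusion(paths, w1, w2):
    """Illustrative channel-attention fusion (assumed SE-style, not the paper's exact design).

    paths: list of feature maps, one per path; each is a list of channels,
           and each channel is an H x W list of lists of floats.
    w1:    hidden x C weight matrix (C = total channels over all paths).
    w2:    C x hidden weight matrix.
    """
    # Concatenate the path outputs along the channel dimension.
    channels = [ch for path in paths for ch in path]
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]
    # Excite: hidden layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Reweight every channel by its gate value.
    fused = [[[v * a for v in row] for row in ch] for ch, a in zip(channels, weights)]
    return fused, weights

# Toy inputs: three paths (spatial, context, graph), one 2x2 channel each.
spatial = [[[1.0, 2.0], [3.0, 4.0]]]
context = [[[0.5, 0.5], [0.5, 0.5]]]
graph = [[[2.0, 0.0], [0.0, 2.0]]]
w1 = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]    # hypothetical weights, normally learned
w2 = [[0.5, -0.5], [0.2, 0.2], [-0.3, 0.4]]
fused, weights = channel_attention_fusion([spatial, context, graph], w1, w2)
print(weights)
```

Each gate value lies strictly between 0 and 1, so the fusion rescales the channels of the three paths relative to one another rather than discarding any of them.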
Pages: 597-608
Page count: 12