Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation

被引:70
作者
Peng, Chengli [1 ]
Tian, Tian [2 ]
Chen, Chen [3 ]
Guo, Xiaojie [4 ]
Ma, Jiayi [1 ]
机构
[1] Wuhan Univ, Elect Informat Sch, Wuhan 430072, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[3] Univ North Carolina, Dept Elect & Comp Engn, Charlotte, NC 28223 USA
[4] Tianjin Univ, Sch Comp Software, Tianjin 300350, Peoples R China
关键词
Semantic segmentation; Real time; Deep learning; Attention mechanism; NETWORK;
D O I
10.1016/j.neunet.2021.01.021
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The encoder-decoder structure has been introduced into semantic segmentation to improve the spatial accuracy of the network by fusing high- and low-level feature maps. However, recent state-of-the-art encoder-decoder-based methods can hardly attain the real-time requirement due to their complex and inefficient decoders. To address this issue, in this paper, we propose a lightweight bilateral attention decoder for real-time semantic segmentation. It consists of two blocks and can fuse different level feature maps via two steps, i.e., information refinement and information fusion. In the first step, we propose a channel attention branch to refine the high-level feature maps and a spatial attention branch for the low-level ones. The refined high-level feature maps can capture more exact semantic information and the refined low-level ones can capture more accurate spatial information, which significantly improves the information capturing ability of these feature maps. In the second step, we develop a new fusion module named pooling fusing block to fuse the refined high- and low-level feature maps. This fusion block can take full advantages of the high- and low-level feature maps, leading to high-quality fusion results. To verify the efficiency of the proposed bilateral attention decoder, we adopt a lightweight network as the backbone and compare our proposed method with other state-of-the-art real-time semantic segmentation methods on the Cityscapes and Camvid datasets. Experimental results demonstrate that our proposed method can achieve better performance with a higher inference speed. Moreover, we compare our proposed network with several state-of-the-art non-real-time semantic segmentation methods and find that our proposed network can also attain better segmentation performance. (C) 2021 Elsevier Ltd. All rights reserved.
引用
收藏
页码:188 / 199
页数:12
相关论文
共 40 条
[1]  
[Anonymous], 2016, ENET DEEP NEURAL NET
[2]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[3]  
Bahdanau D, 2016, Arxiv, DOI arXiv:1409.0473
[4]   Semantic object classes in video: A high-definition ground truth database [J].
Brostow, Gabriel J. ;
Fauqueur, Julien ;
Cipolla, Roberto .
PATTERN RECOGNITION LETTERS, 2009, 30 (02) :88-97
[5]   Importance-Aware Semantic Segmentation for Autonomous Vehicles [J].
Chen, Bike ;
Gong, Chen ;
Yang, Jian .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (01) :137-148
[6]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[7]  
Chen LB, 2017, IEEE INT SYMP NANO, P1, DOI 10.1109/NANOARCH.2017.8053709
[8]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[9]   Exploring New Backbone and Attention Module for Semantic Segmentation in Street Scenes [J].
Fan, Lei ;
Wang, Wei-Chien ;
Zha, Fuyuan ;
Yan, Jiapeng .
IEEE ACCESS, 2018, 6 :71566-71580
[10]   Dual Attention Network for Scene Segmentation [J].
Fu, Jun ;
Liu, Jing ;
Tian, Haijie ;
Li, Yong ;
Bao, Yongjun ;
Fang, Zhiwei ;
Lu, Hanqing .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3141-3149