Efficient Attention Pyramid Network for Semantic Segmentation

Cited by: 8
Authors
Yang, Qirui [1, 2, 3]
Ku, Tao [1, 2]
Hu, Kunyuan [1, 2]
Affiliations
[1] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Peoples R China
[2] Chinese Acad Sci, Inst Robot & Intelligent Mfg, Shenyang 110169, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp & Control, Beijing 100049, Peoples R China
Keywords
Semantics; Convolution; Feature extraction; Task analysis; Image segmentation; Decoding; Computer vision; Semantic segmentation; attention mechanism; spatial pyramid; PASCAL VOC 2012; Cityscapes;
DOI
10.1109/ACCESS.2021.3053316
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Semantic segmentation is a task that covers most of the perception needs of intelligent vehicles in a unified way. Recent studies have shown that attention mechanisms achieve impressive performance in computer vision tasks. Current attention-based segmentation methods differ from one another in the position and form of the attention mechanism, and they perform differently in practice. This paper first examines the effectiveness of multi-scale context features and attention mechanisms in segmentation tasks. We find that multi-scale features and channel attention can play a vital role in constructing effective context features. Based on this analysis, this paper proposes an efficient attention pyramid network (EAPNet) for semantic segmentation. Specifically, to efficiently handle the problem of segmenting objects at multiple scales, we design an efficient channel attention pyramid (ECAP), which employs atrous convolution with channel attention, in cascade or in parallel, to capture multi-scale context using multiple atrous rates. Furthermore, we propose a residual attention fusion block (RAFB), whose purpose is to focus simultaneously on meaningful low-level feature maps and spatial location information. We also explore different channel attention modules and spatial attention modules and describe their impact on network performance. We empirically evaluate EAPNet on two semantic segmentation datasets, PASCAL VOC 2012 and Cityscapes. Experimental results show that, without MS COCO pre-training or any post-processing, EAPNet achieves 81.7% mIoU on the PASCAL VOC 2012 validation set. With DeepLabv3+ as the baseline, EAPNet improves model performance by more than 1.50% mIoU.
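The results in the abstract are reported in mIoU (mean intersection over union). As a reading aid, here is a minimal pure-Python sketch of how that metric is commonly computed over flattened label arrays. It is not the authors' evaluation code: the function name `mean_iou`, the `ignore_index=255` default (a common convention for unlabeled pixels), and the toy labels are assumptions made for illustration. Benchmark scores are normally accumulated over an entire validation set rather than a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are flat sequences of integer class ids of equal length.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    Predictions are assumed to lie in range(num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical toy labels: three classes, last pixel unlabeled.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. Averaging per class rather than per pixel is what keeps small classes from being drowned out by large ones.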
Pages: 18867-18875
Page count: 9
Related papers
50 in total
  • [41] GA-NET: Global Attention Network for Point Cloud Semantic Segmentation
    Deng, Shuang
    Dong, Qiulei
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1300 - 1304
  • [42] Lightweight Attention Network for Very High-Resolution Image Semantic Segmentation
    Guan, Renchu
    Wang, Mingming
    Bruzzone, Lorenzo
    Zhao, Haishi
    Yang, Chen
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [43] Local Fusion Attention Network for Semantic Segmentation of Building Facade Point Clouds
    Su, Yanfei
    Liu, Weiquan
    Cheng, Ming
    Yuan, Zhimin
    Wang, Cheng
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [44] Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images
    Zhang, Tianyang
    Zhang, Xiangrong
    Zhu, Peng
    Tang, Xu
    Li, Chen
    Jiao, Licheng
    Zhou, Huiyu
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (10) : 10999 - 11013
  • [45] Joint pyramid attention network for real-time semantic segmentation of urban scenes
    Hu, Xuegang
    Jing, Liyuan
    Sehar, Uroosa
    APPLIED INTELLIGENCE, 2022, 52 (01) : 580 - 594
  • [47] Multi-Granularity Context Network for Efficient Video Semantic Segmentation
    Liang, Zhiyuan
    Dai, Xiangdong
    Wu, Yiqian
    Jin, Xiaogang
    Shen, Jianbing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3163 - 3175
  • [48] MSPAN: Multi-scale pyramid attention network for efficient skin cancer lesion segmentation
    Ahmed, Noor
    Xin, Tan
    Lizhuang, Ma
    IET IMAGE PROCESSING, 2024, 18 (07) : 1667 - 1680
  • [49] HCNet: Hierarchical Context Network for Semantic Segmentation
    Chong, Yanwen
    Nie, Congchong
    Tao, Yulong
    Chen, Xiaoshu
    Pan, Shaoming
    IEEE ACCESS, 2020, 8 : 179213 - 179223
  • [50] GPNet: Gated pyramid network for semantic segmentation
    Zhang, Yu
    Sun, Xin
    Dong, Junyu
    Chen, Changrui
    Lv, Qingxuan
    PATTERN RECOGNITION, 2021, 115