An Adaptive Post-Processing Network With the Global-Local Aggregation for Semantic Segmentation

Cited by: 6
Authors
Zhu, Guilin [1]
Wang, Runmin [1]
Liu, Yingying [1]
Zhu, Zhenlin [1]
Gao, Changxin [2]
Liu, Li [3]
Sang, Nong [2]
Affiliations
[1] Hunan Normal Univ, Sch Informat Sci & Engn, Changsha 410081, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[3] Natl Univ Def Technol, Sch Syst Engn, Changsha 410000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Context modeling; Semantic segmentation; Task analysis; Predictive models; Transformers; Modeling; Adaptation models; post-processing; global-local aggregation; pixel-aware attention; class-aware attention;
DOI
10.1109/TCSVT.2023.3292156
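Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract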
Current semantic segmentation methods mainly focus on modeling the context of the global image to obtain high-quality segmentation results. However, they ignore the role of local image patches, which contain complementary and effective context information. In this paper, we propose an adaptive post-processing network (APPNet) for semantic segmentation based on the predictions of current methods on the global image and on local image patches. The key component of APPNet is the global-local aggregation module, which models the context between global predictions and local predictions to generate accurate pixel-wise representations. Furthermore, we develop an adaptive points replacement module to compensate for the lack of fine detail in the global prediction and the overconfidence in local predictions. Our method can be readily integrated into existing segmentation methods (e.g., ConvNeXt, HRNet, ViT-Adapter) with little memory overhead and without extra modification to the current models. We empirically demonstrate that our method brings performance improvements across diverse datasets (e.g., Cityscapes, ADE20K, PASCAL-Context, COCO-Stuff).
Pages: 1159-1173
Number of pages: 15