Asymmetric Cascade Fusion Network for Building Extraction

Cited by: 15
Authors
Chan, Sixian [1 ,2 ,3 ]
Wang, Yuan [1 ]
Lei, Yanjing [1 ]
Cheng, Xu [4 ]
Chen, Zhaomin [5 ]
Wu, Wei [1 ,2 ]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Geog Informat Modern Ind, Hangzhou 310023, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, KLME, CIC FEMD, Nanjing 210094, Jiangsu, Peoples R China
[4] Tianjin Univ Technol, Coll Comp Sci & Engn, Tianjin 300382, Peoples R China
[5] Wenzhou Univ, Coll Comp Sci & Artificial Intelligence, Wenzhou 325000, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Funding
National Natural Science Foundation of China;
Keywords
Buildings; Feature extraction; Transformers; Remote sensing; Decoding; Convolutional neural networks; Data mining; Asymmetric architecture; building extraction; multibranch weighted pyramid pooling; multigranularity; SEMANTIC SEGMENTATION; NET;
DOI
10.1109/TGRS.2023.3306018
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject classification code
0708 ; 070902 ;
Abstract
U-Net-like models have been widely studied in the field of building extraction. However, most of these models are based on locally sensed convolutional neural networks (CNNs) designed with a symmetric structure and a single mode of feature processing, so they cannot accurately identify buildings of different sizes, shapes, and colors in remote sensing images. To overcome these problems, we propose the asymmetric cascade fusion network (ACFN), a novel asymmetric architecture based on the vision transformer (ViT) that recognizes buildings of different sizes and shapes by processing multigranularity features in different ways. First, the asymmetric architecture obtains multigranularity features with global contextual information by embedding different types of attention in encoder-decoders of different sizes. Through semantic reasoning, this architecture can identify densely distributed and occluded buildings in remote sensing images with complex information. Second, we design a multibranch weighted pyramid pooling module (MWPPM), which sets different branch weights to offset the background noise that comes with introducing global contextual information. Our ACFN significantly improves results on the Beijing buildings, ISPRS-Vaihingen, and LoveDA datasets.
Pages: 18