Asymmetric Cascade Fusion Network for Building Extraction

Cited by: 17
Authors
Chan, Sixian [1 ,2 ,3 ]
Wang, Yuan [1 ]
Lei, Yanjing [1 ]
Cheng, Xu [4 ]
Chen, Zhaomin [5 ]
Wu, Wei [1 ,2 ]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Geog Informat Modern Ind, Hangzhou 310023, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, KLME, CIC FEMD, Nanjing 210094, Jiangsu, Peoples R China
[4] Tianjin Univ Technol, Coll Comp Sci & Engn, Tianjin 300382, Peoples R China
[5] Wenzhou Univ, Coll Comp Sci & Artificial Intelligence, Wenzhou 325000, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Funding
National Natural Science Foundation of China;
Keywords
Buildings; Feature extraction; Transformers; Remote sensing; Decoding; Convolutional neural networks; Data mining; Asymmetric architecture; building extraction; multibranch weighted pyramid pooling; multigranularity; SEMANTIC SEGMENTATION; NET;
DOI
10.1109/TGRS.2023.3306018
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
U-Net-like models have been widely studied for building extraction. However, most of these models are built on convolutional neural networks (CNNs) with local receptive fields, a symmetric structure, and a single mode of feature processing, so they cannot accurately identify buildings of different sizes, shapes, and colors in remote sensing images. To overcome these problems, we propose the asymmetric cascade fusion network (ACFN), a vision transformer (ViT)-based model with a novel asymmetric architecture that recognizes buildings of different sizes and shapes by processing multigranularity features in different ways. First, the asymmetric architecture obtains multigranularity features with global contextual information by embedding different types of attention in encoder-decoders of different sizes. Through semantic reasoning, this architecture can identify densely distributed and occluded buildings in remote sensing images with complex information. Second, we design a multibranch weighted pyramid pooling module (MWPPM), which sets different branch weights to offset the background noise introduced along with the global contextual information. Our ACFN significantly improves performance on the Beijing buildings, ISPRS-Vaihingen, and LoveDA datasets.
Pages: 18