SPANet: Successive Pooling Attention Network for Semantic Segmentation of Remote Sensing Images

Cited by: 59
Authors
Sun, Le [1, 2]
Cheng, Shiwei [1]
Zheng, Yuhui [1]
Wu, Zebin [3]
Zhang, Jianwei [4]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, Sch Math & Stat, Nanjing 210044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Semantics; Image segmentation; Remote sensing; Data mining; Context modeling; Decoding; Attention mechanism; convolutional neural network; remote sensing images; semantic segmentation; successive pooling; MULTISCALE;
DOI
10.1109/JSTARS.2022.3175191
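Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract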
Precisely segmenting small-scale objects and object boundaries in remote sensing images remains a major challenge for convolutional neural networks. As a model gets deeper, low-level features carrying geometric information and high-level features carrying semantic information cannot be obtained simultaneously. To alleviate this problem, a successive pooling attention network (SPANet) is proposed. SPANet consists mainly of a ResNet50 backbone, a successive pooling attention module (SPAM), and a feature fusion module (FFM). Specifically, SPANet uses two parallel branches: one extracts high-level features with ResNet50, and the other extracts low-level features with the first 11 layers of ResNet50. Both the high- and low-level features are then fed to the SPAM, which is composed mainly of a successive pooling operator and a self-attention submodule, to extract deeper multiscale and salient features. In addition, the low- and high-level features output by the SPAM are fused by the FFM so that spatial and geometric information complement each other. This fusion module alleviates the difficulty of accurately segmenting object edges. Finally, the high-level features and the enhanced low-level features of the two branches are fused to obtain the final prediction. Experiments on two remote sensing datasets show that the proposed SPANet achieves good segmentation performance compared with other models.
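The abstract does not define the successive pooling operator, so the following is only a minimal sketch of the general idea of pooling in succession to obtain features at several scales. It assumes non-overlapping 2x2 average pooling and a single-channel feature map stored as nested lists; the function names `avg_pool_2x2` and `successive_pooling` are hypothetical and are not taken from the paper, and the self-attention submodule and FFM are not modeled.

```python
from typing import List

Grid = List[List[float]]


def avg_pool_2x2(x: Grid) -> Grid:
    """Non-overlapping 2x2 average pooling (stride 2).

    Assumption for illustration only: the paper's pooling type, kernel
    size, and stride are not given in the abstract. Odd trailing rows
    or columns are dropped.
    """
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [
            (x[2 * i][2 * j] + x[2 * i][2 * j + 1]
             + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def successive_pooling(x: Grid, steps: int) -> List[Grid]:
    """Pool the output of the previous pooling step and keep every scale."""
    scales = [x]
    for _ in range(steps):
        # Stop early once the map is too small to pool again.
        if len(scales[-1]) < 2 or len(scales[-1][0]) < 2:
            break
        scales.append(avg_pool_2x2(scales[-1]))
    return scales


# A toy 4x4 feature map with values 0..15 in row-major order.
feature_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid = successive_pooling(feature_map, steps=2)
sizes = [(len(g), len(g[0])) for g in pyramid]
```

Each pooling step consumes the result of the previous one, so the list holds progressively coarser views of the same input (here 4x4, 2x2, and 1x1). How SPANet combines such scales with self-attention is described in the paper itself, not in this sketch.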
Pages: 4045-4057
Number of pages: 13