Cascaded CNN and global-local attention transformer network-based semantic segmentation for high-resolution remote sensing image

Cited by: 1
Authors
Liu, Xiaohui [1]
Zhang, Lei [1]
Wang, Rui [2]
Li, Xiaoyu [2]
Xu, Jiyang [2]
Lu, Xiaochen [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai, Peoples R China
[2] Shanghai AllyNav Technol Co, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
high-resolution remote sensing images; semantic segmentation; convolution neural network; transformer; global-local attention transformer block; multilevel channel attention integration block;
DOI
10.1117/1.JRS.18.034502
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
High-resolution remote sensing images (HRRSIs) contain rich local spatial information and long-distance location dependencies, both of which play an important role in semantic segmentation and have received increasing research attention. However, HRRSIs often exhibit large intraclass variance and small interclass variance because of the diversity and complexity of ground objects, which makes semantic segmentation challenging. In most networks, the segmentation results omit many small-scale objects and fragment large-scale objects because local feature extraction is insufficient and global information is underused. We propose a network that cascades a convolutional neural network with a global-local attention transformer, called the CNN-transformer cascade network. First, convolution blocks and global-local attention transformer blocks extract multiscale local features and long-range location information, respectively. Then a multilevel channel attention integration block fuses geometric and semantic features from different depths and revises the channel weights through a channel attention module to resist interference from redundant information. Finally, upsampling with a deconvolution operation improves the smoothness of the segmentation. We compare our method with several state-of-the-art methods on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method improves the integrity and independence of the segmentation results for multiscale objects.
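The abstract does not specify how the channel attention module computes its weights. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style gate: features from two depths are concatenated along the channel axis, each channel is averaged to a single value, and a sigmoid gate rescales the channels. All function names, the single linear layer, and the toy inputs are hypothetical and are not taken from the paper.

```python
import math

def global_average_pool(feature_maps):
    """Squeeze step: reduce each channel (an H x W grid) to its mean value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_weights(descriptor, weight_matrix, bias):
    """Excitation step (assumed): one linear layer, then a sigmoid gate per channel."""
    return [sigmoid(sum(w * d for w, d in zip(row, descriptor)) + b)
            for row, b in zip(weight_matrix, bias)]

def fuse_with_channel_attention(shallow, deep, weight_matrix, bias):
    """Concatenate two feature stacks along the channel axis and rescale each channel.

    Both inputs are lists of channels, each channel a list of rows, and are
    assumed to share the same spatial size already.
    """
    fused = shallow + deep  # channel-wise concatenation
    gates = channel_weights(global_average_pool(fused), weight_matrix, bias)
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fused)]
    return scaled, gates

# Toy input: one shallow channel (all 1.0) and one deep channel (all 3.0), each 2 x 2.
shallow = [[[1.0, 1.0], [1.0, 1.0]]]
deep = [[[3.0, 3.0], [3.0, 3.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical learned weights
scaled, gates = fuse_with_channel_attention(shallow, deep, identity, [0.0, 0.0])
```

With these toy weights the channel with the larger mean response receives the larger gate, so it is attenuated less than the weaker channel; in a trained network the gate weights would be learned rather than fixed.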
Pages: 19