CCTNet: CNN and Cross-Shaped Transformer Hybrid Network for Remote Sensing Image Semantic Segmentation

Cited by: 0
Authors
Wu, Honglin [1]
Zeng, Zhaobin [1]
Huang, Peng [1]
Yu, Xinyu [1]
Zhang, Min [1]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Peoples R China
Keywords
Transformers; Feature extraction; Semantic segmentation; Semantics; Remote sensing; Convolutional neural networks; Decoding; Computational efficiency; Data mining; Computer architecture; Convolutional neural network (CNN); cross-shaped transformer; global contextual information; remote sensing image; semantic segmentation; CLASSIFIER
DOI
10.1109/JSTARS.2024.3487003
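Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract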
Deep learning methods have achieved great success in remote sensing image segmentation in recent years, but building a lightweight segmentation model with comprehensive local and global feature extraction capabilities remains challenging. In this article, we propose a convolutional neural network (CNN) and cross-shaped transformer hybrid network (CCTNet) for semantic segmentation of high-resolution remote sensing images. The model follows an encoder-decoder structure. It employs ResNet18 as the encoder to extract hierarchical feature information, and constructs a transformer decoder based on efficient cross-shaped self-attention to model both local and global feature information while keeping the network lightweight. Moreover, the transformer block introduces a mixed-scale convolutional feedforward network to further enhance multiscale information extraction. Furthermore, a simplified and efficient feature aggregation module gradually aggregates local and global information at different stages. Extensive comparison experiments on the ISPRS Vaihingen and Potsdam datasets show that our method outperforms state-of-the-art lightweight methods.
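Illustrative note: the abstract does not specify how the cross-shaped self-attention is computed. The sketch below assumes the stripe layout used in CSWin-Transformer-style attention, where a query attends within the horizontal stripe and the vertical stripe that contain it. The function name `cross_shaped_region` and the `stripe_width` parameter are hypothetical and are not taken from the paper. The sketch only shows why such an attention region is cheaper than attending to every position.

```python
def cross_shaped_region(height, width, row, col, stripe_width):
    """Return the set of (r, c) positions visible to the query at (row, col).

    Assumption: non-overlapping stripes of `stripe_width` rows (horizontal)
    and `stripe_width` columns (vertical), as in CSWin-style attention.
    """
    # Horizontal stripe: the band of rows containing the query, full width.
    top = (row // stripe_width) * stripe_width
    horizontal = {
        (r, c)
        for r in range(top, min(top + stripe_width, height))
        for c in range(width)
    }
    # Vertical stripe: the band of columns containing the query, full height.
    left = (col // stripe_width) * stripe_width
    vertical = {
        (r, c)
        for r in range(height)
        for c in range(left, min(left + stripe_width, width))
    }
    # The union of the two stripes forms the cross-shaped region.
    return horizontal | vertical


region = cross_shaped_region(8, 8, row=3, col=5, stripe_width=2)
print(len(region), "of", 8 * 8, "positions attended")  # 28 of 64
```

Under these assumptions, a query on an 8 x 8 feature map with stripe width 2 sees 28 positions instead of all 64, while still reaching every row and every column.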
Pages: 19986-19997
Number of pages: 12