Wavelet Transform Feature Enhancement for Semantic Segmentation of Remote Sensing Images

Cited by: 13
Authors
Li, Yifan [1]
Liu, Ziqian [1]
Yang, Junli [1]
Zhang, Haopeng [2, 3, 4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Int Sch, Beijing 100876, Peoples R China
[2] Beihang Univ, Sch Astronaut, Dept Aerosp Informat Engn, Beijing 102206, Peoples R China
[3] Beijing Key Lab Digital Media, Beijing 102206, Peoples R China
[4] Minist Educ, Key Lab Spacecraft Design Optimizat & Dynam Simula, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
discrete wavelet transform; remote sensing images; feature enhancement; semantic segmentation;
DOI
10.3390/rs15245644
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
With developments in deep learning, semantic segmentation of remote sensing images has made great progress. Currently, mainstream methods are based on convolutional neural networks (CNNs) or vision transformers. However, these methods are not very effective in extracting features from remote sensing images, which are usually of high resolution with plenty of detail. Operations including downsampling will cause the loss of such features. To address this problem, we propose a novel module called Hierarchical Wavelet Feature Enhancement (WFE). The WFE module involves three sequential steps: (1) performing multi-scale decomposition of an input image based on the discrete wavelet transform; (2) enhancing the high-frequency sub-bands of the input image; and (3) feeding them back to the corresponding layers of the network. Our module can be easily integrated into various existing CNNs and transformers, and does not require additional pre-training. We conducted experiments on the ISPRS Potsdam and ISPRS Vaihingen datasets, with results showing that our method improves the benchmarks of CNNs and transformers while performing little additional computation.
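The three steps named in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a Haar wavelet, and the function names (`haar_dwt2`, `multiscale_high_freq`, `inject`) and the `gain` parameter are hypothetical. In particular, `inject` is a fixed, hand-written stand-in for the enhancement step, which the abstract does not specify in detail.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an array with even height and width."""
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between the two rows
    hl = (a - b + c - d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)

def multiscale_high_freq(image, levels=3):
    """Step 1: repeatedly decompose the approximation band.

    Returns the final approximation and, for each level, the three
    high-frequency sub-bands stacked into an array of shape (3, H, W).
    """
    pyramid = []
    ll = image.astype(np.float64)
    for _ in range(levels):
        ll, highs = haar_dwt2(ll)
        pyramid.append(np.stack(highs, axis=0))
    return ll, pyramid

def inject(feature, high, gain=1.0):
    """Steps 2 and 3 (assumed form): turn the sub-bands into a detail map
    and add it to every channel of a feature map of the same spatial size."""
    detail = np.abs(high).mean(axis=0)          # (H, W) detail magnitude
    return feature + gain * detail[None, :, :]  # broadcast over channels

# Usage: a 64x64 single-channel image and a matching 4-channel feature map.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
approx, pyramid = multiscale_high_freq(image, levels=3)
features = np.zeros((4, 32, 32))
enhanced = inject(features, pyramid[0], gain=0.5)
```

Each decomposition level halves the spatial resolution, so the sub-bands at level k line up with a network stage that has downsampled by a factor of 2^k. That alignment is what makes it possible to feed them back to "the corresponding layers" without resampling.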
Pages: 20