Channel Self-Attention Based Multiscale Spatial-Frequency Domain Network for Oriented Object Detection in Remote Sensing Imagery

Cited by: 0
Authors
Xu, Yang [1]
Pan, Yushan [1]
Wu, Zebin [1]
Wei, Zhihui [1]
Zhan, Tianming [2, 3]
Affiliations
[1] Nanjing Univ Sci & Technol NJUST, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Nanjing Audit Univ, Jiangsu Key Construct Lab Audit Informat Engn, Nanjing 211815, Peoples R China
[3] Nanjing Audit Univ, Sch Informat Engn, Nanjing 211815, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Frequency-domain analysis; Detectors; Remote sensing; Object detection; Data mining; Attention mechanisms; Wavelet transforms; Convolution; Semantics; Fusion features; Haar wavelet transform; oriented object detection; remote sensing imagery; spatial-frequency domain;
DOI
10.1109/TGRS.2024.3500013
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Subject classification code
0708 ; 070902 ;
Abstract
The detection of oriented objects in remote sensing images remains a daunting challenge due to their complex backgrounds, various sizes, and especially arbitrary orientations. However, most of the existing methods only model the structural features of the images in the spatial domain, while the horizontal convolution kernels limit the model's ability to perceive object direction information. Furthermore, the frequency features contain rich information about scale, texture, and angle, which can be a good complement to the spatial features. Inspired by this, we propose a multiscale spatial-frequency domain network (MSFN) to utilize spatial-frequency information for oriented object detection, which can be integrated seamlessly into any convolutional neural network (CNN) architecture and trained end to end. First, multiscale Haar wavelet transforms are leveraged to extract the multiscale frequency domain features from the image. Subsequently, a channel alignment feature fusion module (CA-FFM) is proposed to fuse the high-level semantic features extracted by the CNN with the low-level texture features extracted by the wavelet transform at multiple scales. Finally, a channel self-attention (CSA)-based spatial-frequency feature perception module (SFPM) is designed to perform self-attention weighted aggregation on the fused features along the channel dimension, thereby constructing a novel spatial-frequency feature extraction backbone network for oriented object detectors in remote sensing images. Experimental results on the DOTA and HRSC2016 datasets validate the effectiveness and universality of the proposed method.
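The abstract names multiscale Haar wavelet transforms as the source of the frequency-domain features but gives no implementation details. As an illustration only, the following is a minimal NumPy sketch of a standard orthonormal 2D Haar decomposition applied recursively to a single-channel image. The function names `haar_dwt2` and `multiscale_haar`, the `levels` parameter, and the sub-band labels are assumptions made for this sketch; they are not taken from the paper, and the authors' MSFN code may differ.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (illustrative sketch).

    x: 2D array with even height and width.
    Returns four sub-bands, each half the size of x in both dimensions:
    the approximation plus three detail bands. Sub-band naming conventions
    vary between libraries; the labels here are an assumption.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0    # local average (low-pass in both axes)
    col_diff = (a - b + c - d) / 2.0  # difference between adjacent columns
    row_diff = (a + b - c - d) / 2.0  # difference between adjacent rows
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return approx, col_diff, row_diff, diag


def multiscale_haar(x, levels=3):
    """Apply haar_dwt2 recursively to the approximation band.

    Returns the coarsest approximation and a list with one
    (col_diff, row_diff, diag) tuple per level, finest level first.
    """
    details = []
    current = x
    for _ in range(levels):
        current, col_diff, row_diff, diag = haar_dwt2(current)
        details.append((col_diff, row_diff, diag))
    return current, details
```

For a 512 x 512 input and `levels=3`, the detail bands have sizes 256 x 256, 128 x 128, and 64 x 64. In a detector, per-level bands like these could be stacked along the channel axis and fused with CNN feature maps of matching resolution; how the paper's CA-FFM performs that fusion is not specified in this record.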
Pages: 15
Related papers
50 in total
  • [31] Shadow Detection in Remote Sensing Images Based on Multibranch Feature Aggregation and Channel-Spatial Attention
    Chang, Xueli
    Shi, Haiyang
    Zhang, Tiejun
    Jin, Huazhong
    Xu, Ao
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2025, 18 : 2618 - 2630
  • [32] Adaptively Attentional Feature Fusion Oriented to Multiscale Object Detection in Remote Sensing Images
    Zhao, Wenqing
    Kang, Yijin
    Chen, Hao
    Zhao, Zhenhuan
    Zhao, Zhenbing
    Zhai, Yongjie
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [33] An Explainable Spatial-Frequency Multiscale Transformer for Remote Sensing Scene Classification
    Yang, Yuting
    Jiao, Licheng
    Liu, Fang
    Liu, Xu
    Li, Lingling
    Chen, Puhua
    Yang, Shuyuan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [34] Self-Attention Guidance and Multiscale Feature Fusion-Based UAV Image Object Detection
    Zhang, Yunzuo
    Wu, Cunyu
    Zhang, Tian
    Liu, Yameng
    Zheng, Yuxin
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [35] SAR Image Change Detection in Spatial-Frequency Domain Based on Attention Mechanism and Gated Linear Unit
    Zhao, Chunhui
    Ma, Lirui
    Wang, Lu
    Ohtsuki, Tomoaki
    Mathiopoulos, P. Takis
    Wang, Yong
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [36] Multiscale Semantic Guidance Network for Object Detection in VHR Remote Sensing Images
    Zhu, Shengyu
    Zhang, Junping
    Liang, Xuejian
    Guo, Qingle
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [37] Object Detection Based on Efficient Multiscale Auto-Inference in Remote Sensing Images
    Zhang, Shaojing
    Mu, Xiaodong
    Kou, Guangjie
    Zhao, Jingyu
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (09) : 1650 - 1654
  • [38] Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery
    Shamsolmoali, Pourya
    Zareapoor, Masoumeh
    Chanussot, Jocelyn
    Zhou, Huiyu
    Yang, Jie
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [39] Stepwise Locating Bidirectional Pyramid Network for Object Detection in Remote Sensing Imagery
    Yu, Nanjing
    Ren, Haohao
    Deng, Tianmin
    Fan, Xiaobiao
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20