Hybrid Network Model: TransConvNet for Oriented Object Detection in Remote Sensing Images

Cited by: 20
Authors
Liu, Xulun [1 ]
Ma, Shiping [1 ]
He, Linyuan [1 ,2 ]
Wang, Chen [1 ]
Chen, Zhe [3 ]
Affiliations
[1] Air Force Engn Univ, Aviat Engn Sch, Xian 710038, Peoples R China
[2] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian 710072, Peoples R China
[3] Xian Univ Posts & Telecommun, Natl Engn Lab Wireless Secur, Xian 710121, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
oriented object detection; remote sensing images; self-attention; transformer; feature fusion;
DOI
10.3390/rs14092090
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Complex backgrounds, the diversity of object scales and orientations, and the limitations of convolutional neural networks (CNNs) are persistent challenges for oriented object detection in remote sensing images (RSIs). The inductive bias of a CNN makes the network translation invariant, but it adapts poorly to RSIs in which objects appear in arbitrary directions. This paper therefore designs a hybrid network, TransConvNet, which combines the advantages of CNNs and self-attention-based networks: it aggregates global and local information, compensates for the CNN's lack of rotation invariance with strong contextual attention, and adapts to the arbitrary object directions found in RSIs. In addition, to reduce the influence of complex backgrounds and multi-scale objects, an adaptive feature fusion network (AFFN) is designed to improve the representation ability of feature maps at different resolutions. Finally, an adaptive weight loss function is used to train the network and further improve detection performance. Extensive experiments on the DOTA, UCAS-AOD, and VEDAI data sets demonstrate the effectiveness of the proposed method.
Pages: 17