Hybrid Network Model: TransConvNet for Oriented Object Detection in Remote Sensing Images

Cited by: 23

Authors:

Liu, Xulun [1]

Ma, Shiping [1]

He, Linyuan [1,2]

Wang, Chen [1]

Chen, Zhe [3]

Affiliations:

[1] Air Force Engn Univ, Aviat Engn Sch, Xian 710038, Peoples R China

[2] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian 710072, Peoples R China

[3] Xian Univ Posts & Telecommun, Natl Engn Lab Wireless Secur, Xian 710121, Peoples R China

Source:

REMOTE SENSING | 2022 / Vol. 14 / Issue 09

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

oriented object detection; remote sensing images; self-attention; transformer; feature fusion;

DOI:

10.3390/rs14092090

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

The complexity of backgrounds, the diversity of object scales and orientations, and the limitations of convolutional neural networks (CNNs) have long been challenges for oriented object detection in remote sensing images (RSIs). This paper designs a hybrid network model to address these challenges and further improve oriented object detection performance. The inductive bias of CNNs makes the network translation invariant, but CNNs adapt poorly to RSIs in which objects appear at arbitrary orientations. The proposed hybrid network, TransConvNet, therefore integrates the advantages of CNNs and self-attention-based networks: it emphasizes the aggregation of global and local information, compensates for the CNN's lack of rotation invariance with strong contextual attention, and adapts to the arbitrary object orientations found in RSIs. In addition, to reduce the influence of complex backgrounds and multi-scale objects, an adaptive feature fusion network (AFFN) is designed to improve the representational ability of feature maps with different resolutions. Finally, an adaptive weight loss function is used to train the network to further improve detection performance. Extensive experimental results on the DOTA, UCAS-AOD, and VEDAI data sets demonstrate the effectiveness of the proposed method.
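The abstract does not specify how the AFFN combines feature maps, so the following is only a minimal sketch of the general idea of weighted fusion across resolutions, not the paper's network. It assumes two single-channel maps, nearest-neighbour upsampling, and a pair of scalar logits standing in for learned fusion weights; the function names `softmax`, `upsample_nearest`, and `fuse` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D map (list of rows) by an integer factor."""
    out = []
    for row in fmap:
        # Repeat each value `factor` times along the row, then repeat the row.
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(fine, coarse, logits):
    """Blend a fine map with an upsampled coarse map using softmax weights.

    `logits` stands in for parameters a network would learn (an assumption
    made for illustration; the paper's AFFN may work differently).
    """
    factor = len(fine) // len(coarse)
    coarse_up = upsample_nearest(coarse, factor)
    w_fine, w_coarse = softmax(logits)
    return [
        [w_fine * f + w_coarse * c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(fine, coarse_up)
    ]


# Toy inputs: a 4x4 high-resolution map and a 2x2 low-resolution map.
fine = [[2.0] * 4 for _ in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]

# Equal logits give equal weights, so each output is the mean of the two inputs.
fused = fuse(fine, coarse, logits=[0.0, 0.0])
```

Normalizing the weights with a softmax keeps the fused values on the same scale as the inputs while letting training shift emphasis between resolutions; a real detector would apply this per channel and typically per pyramid level.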

Pages: 17
