Multi-Branch Parallel Networks for Object Detection in High-Resolution UAV Remote Sensing Images

Cited by: 8
Authors
Wu, Qihong [1]
Zhang, Bin [1]
Guo, Chang [1]
Wang, Lei [1]
Affiliations
[1] Wuhan Inst Technol, Hubei Prov Key Lab Intelligent Robot, Wuhan 430205, Peoples R China
Keywords
UAV remote sensing images; object detection; self-attention; sampling; feature fusion
DOI
10.3390/drones7070439
Chinese Library Classification number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Uncrewed Aerial Vehicles (UAVs) are instrumental in advancing the field of remote sensing. Nevertheless, the complexity of the background and the dense distribution of objects both present considerable challenges for object detection in UAV remote sensing images. This paper proposes a Multi-Branch Parallel Network (MBPN) based on the ViTDet (Visual Transformer for Object Detection) model, which aims to improve object detection accuracy in UAV remote sensing images. Initially, the discriminative ability of the input feature map of the Feature Pyramid Network (FPN) is improved by incorporating the Receptive Field Enhancement (RFE) and Convolutional Self-Attention (CSA) modules. Subsequently, to mitigate the loss of semantic information, the sampling process of the FPN is replaced by Multi-Branch Upsampling (MBUS) and Multi-Branch Downsampling (MBDS) modules. Lastly, a Feature-Concatenating Fusion (FCF) module is employed to merge feature maps of varying levels, thereby addressing the issue of semantic misalignment. This paper evaluates the performance of the proposed model on both a custom UAV-captured WCH dataset and the publicly available NWPU VHR10 dataset. The experimental results demonstrate that the proposed model achieves an increase in AP(L) of 2.4% and 0.7% on the WCH and NWPU VHR10 datasets, respectively, compared to the baseline model ViTDet-B.
Pages: 17