Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images

Cited by: 38
Authors
Dong, Xiaohu [1]
Qin, Yao [2]
Gao, Yinghui [3]
Fu, Ruigang [1]
Liu, Songlin [4]
Ye, Yuanxin [5]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Sci, Changsha 410073, Peoples R China
[2] Northwest Inst Nucl Technol, Remote Sensing Lab, Xian 710024, Peoples R China
[3] Acad Mil Sci, Warfare Studies Inst, Beijing 100091, Peoples R China
[4] State Key Lab Geoinformat Engn, Xian 710024, Peoples R China
[5] Southwest Jiaotong Univ, Fac Geosci & Environm Engn, Chengdu 610031, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module; NETWORKS;
DOI
10.3390/rs14153735
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, the Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces challenges when used for remote sensing object detection, because the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a 3 × 3 deformable convolution to generate feature maps with deformable receptive fields. Additionally, we introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to adaptively integrate the multi-level outputs of FPN, which further enables multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrate the state-of-the-art performance of the proposed method, with the highest mean Average Precision (mAP) of 73.6%.
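The abstract does not describe the internals of the A-MLFFM, so the following is only a generic NumPy sketch of attention-weighted fusion across pyramid levels, not the authors' module. It assumes that each level is resized to a common resolution, scored from its globally pooled features, and combined with softmax weights. The names `resize_nearest`, `attention_fuse`, and `score_weights` are hypothetical, and the random `score_weights` stand in for parameters that would normally be learned.

```python
import numpy as np


def resize_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return feat[:, rows[:, None], cols[None, :]]


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def attention_fuse(levels, target_index, score_weights):
    """Fuse pyramid levels at the resolution of levels[target_index].

    levels:        list of L arrays, each of shape (C, H_i, W_i)
    score_weights: (L, C) array, a hypothetical stand-in for learned
                   parameters that score each level
    Returns the fused (C, H, W) map and the L attention weights.
    """
    _, th, tw = levels[target_index].shape
    # Bring every level to the target resolution: (L, C, H, W).
    resized = np.stack([resize_nearest(f, th, tw) for f in levels])
    # Global average pooling gives one descriptor per level: (L, C).
    pooled = resized.mean(axis=(2, 3))
    # One scalar score per level, normalised across levels.
    scores = (pooled * score_weights).sum(axis=1)
    alpha = softmax(scores, axis=0)
    # Attention-weighted sum over the level axis.
    fused = (alpha[:, None, None, None] * resized).sum(axis=0)
    return fused, alpha


# Toy pyramid: three levels with 8 channels at decreasing resolution.
rng = np.random.default_rng(0)
levels = [rng.standard_normal((8, s, s)) for s in (32, 16, 8)]
score_weights = rng.standard_normal((3, 8))
fused, alpha = attention_fuse(levels, target_index=1, score_weights=score_weights)
print(fused.shape, alpha)
```

The fused map has the shape of the chosen target level, and the weights sum to one, so each output is a convex combination of the resized levels. The deformable 3 × 3 convolution in the lateral connections is not modelled here.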
Pages: 19