Multi-scale feature fusion with knowledge distillation for object detection in aerial imagery

Cited by: 0
Authors
Li, Mengyuan [1]
Liang, Xingzhu [1,2]
Hu, Qicheng [1]
Lin, Yu-e [1]
Xia, Chenxing [1]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Comp Sci & Engn, Huainan 232001, Anhui, Peoples R China
[2] Anhui Univ Sci & Technol, Huainan Peoples Hosp 1, Affiliated Hosp 1, Huainan 232007, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Self-attention; Feature fusion; Knowledge distillation; Aerial imagery
DOI
10.1016/j.engappai.2025.111518
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Object detection in aerial images is complicated by small, densely packed, and occluded objects. Existing methods usually adopt a fusion structure similar to a feature pyramid network (FPN-like), which adds few parameters but yields low detection accuracy. To balance model size against detection accuracy, this paper proposes MFF-KD, a multi-scale feature fusion method with knowledge distillation for object detection in aerial imagery, which comprises a multi-scale feature fusion (MFF) network and channel and spatial attention knowledge distillation (CSAKD). First, we design a scale-aware feature fusion module (SAFFNet) in the MFF network, which reduces model complexity while preserving more features of small objects. Second, we design an efficient multi-scale self-attention module (EMSA) that is integrated into the deep feature extraction process of the MFF network to capture continuous features of occluded objects. Finally, we propose the CSAKD knowledge distillation method, which enhances both foreground and effective background information in the student model, thereby improving the detection accuracy of the MFF network while controlling the growth of model parameters. We conducted extensive experiments on three publicly available aerial image datasets to validate the effectiveness of our method, and the results show that it achieves good detection results on all three. In addition, our method has 10.5M parameters and a real-time detection speed of 111 frames per second (FPS).
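The abstract does not give the CSAKD formulation, so the following is only a generic illustration of attention-based feature distillation, not the authors' method. It assumes feature maps stored as nested lists of shape [C][H][W], attention defined as a softmax over mean absolute activations, and an L1 gap between teacher and student attention; all function names, the temperature parameter, and the toy tensors are hypothetical.

```python
import math

def softmax(values, temperature=1.0):
    """Numerically stable softmax over a flat list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_attention(feat, temperature=1.0):
    """feat is [C][H][W]; returns a flat H*W distribution over positions."""
    channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    # Mean absolute activation across channels at each spatial position.
    energy = [
        sum(abs(feat[c][i][j]) for c in range(channels)) / channels
        for i in range(height) for j in range(width)
    ]
    return softmax(energy, temperature)

def channel_attention(feat, temperature=1.0):
    """feat is [C][H][W]; returns a length-C distribution over channels."""
    # Mean absolute activation across all positions of each channel.
    energy = [
        sum(abs(v) for row in channel for v in row) / (len(channel) * len(channel[0]))
        for channel in feat
    ]
    return softmax(energy, temperature)

def attention_distillation_loss(student, teacher, temperature=1.0):
    """Sum of L1 gaps between teacher and student attention maps (assumed form)."""
    spatial_gap = sum(abs(s - t) for s, t in zip(
        spatial_attention(student, temperature),
        spatial_attention(teacher, temperature)))
    channel_gap = sum(abs(s - t) for s, t in zip(
        channel_attention(student, temperature),
        channel_attention(teacher, temperature)))
    return spatial_gap + channel_gap

# Toy 2-channel, 2x2 feature maps (made-up values for illustration only).
teacher = [[[1.0, 0.0], [0.0, 3.0]], [[0.5, 0.5], [0.5, 0.5]]]
student = [[[0.2, 0.1], [0.1, 0.4]], [[0.3, 0.3], [0.3, 0.3]]]
loss_same = attention_distillation_loss(teacher, teacher)
loss_diff = attention_distillation_loss(student, teacher)
```

In this sketch the loss is zero when the student's attention maps match the teacher's and grows as they diverge, which is the general idea behind steering a compact student toward the regions and channels a larger teacher emphasizes.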
Pages: 17