Multi-scale feature fusion with knowledge distillation for object detection in aerial imagery

Times cited: 0
Authors
Li, Mengyuan [1]
Liang, Xingzhu [1, 2]
Hu, Qicheng [1]
Lin, Yu-e [1]
Xia, Chenxing [1]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Comp Sci & Engn, Huainan 232001, Anhui, Peoples R China
[2] Anhui Univ Sci & Technol, Huainan Peoples Hosp 1, Affiliated Hosp 1, Huainan 232007, Anhui, Peoples R China
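Funding
National Natural Science Foundation of China
Keywords
Self-attention; Feature fusion; Knowledge distillation; Aerial imagery
DOI
10.1016/j.engappai.2025.111518
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract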
Object detection in aerial images is difficult because objects are small, densely packed, and frequently occluded. Existing methods usually adopt a feature pyramid network (FPN)-like fusion structure, which adds few parameters but yields low detection accuracy. To balance model size against detection accuracy, this paper proposes multi-scale feature fusion with knowledge distillation for object detection in aerial imagery (MFF-KD), which comprises a multi-scale feature fusion (MFF) network and channel and spatial attention knowledge distillation (CSAKD). First, we design a scale-aware feature fusion module (SAFFNet) in the MFF network that reduces model complexity while preserving more features of small objects. Second, we design an efficient multi-scale self-attention module (EMSA), integrated into the deep feature extraction stage of the MFF network, to capture continuous features of occluded objects. Finally, we propose the CSAKD method, which enhances both foreground and effective background information in the student model, improving the detection accuracy of the MFF network while limiting the growth in model parameters. Extensive experiments on three publicly available aerial image datasets validate the method, which achieves good detection results on all three. The model has 10.5M parameters and detects in real time at 111 frames per second (FPS).
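The abstract does not say how CSAKD is formulated. As a rough illustration of the general idea only (weighting a teacher-to-student feature imitation loss by channel and spatial attention), a minimal pure-Python sketch follows. The function names, the softmax-based attention, and the `temperature` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a flat list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(feat, temperature=1.0):
    """One weight per channel from its mean absolute activation (weights sum to C)."""
    c = len(feat)
    means = [
        sum(abs(x) for row in ch for x in row) / (len(ch) * len(ch[0]))
        for ch in feat
    ]
    return [c * w for w in softmax(means, temperature)]


def spatial_attention(feat, temperature=1.0):
    """One weight per position from the channel-averaged absolute activation
    (weights sum to H*W)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    flat = [
        sum(abs(feat[k][i][j]) for k in range(c)) / c
        for i in range(h)
        for j in range(w)
    ]
    weights = softmax(flat, temperature)
    return [[h * w * weights[i * w + j] for j in range(w)] for i in range(h)]


def attention_distill_loss(teacher, student, temperature=1.0):
    """Mean squared feature error, weighted by the teacher's channel and
    spatial attention. Inputs are C x H x W nested lists of equal shape.
    Hypothetical loss for illustration, not the CSAKD loss from the paper."""
    c, h, w = len(teacher), len(teacher[0]), len(teacher[0][0])
    ch_att = channel_attention(teacher, temperature)
    sp_att = spatial_attention(teacher, temperature)
    total = 0.0
    for k in range(c):
        for i in range(h):
            for j in range(w):
                diff = teacher[k][i][j] - student[k][i][j]
                total += ch_att[k] * sp_att[i][j] * diff * diff
    return total / (c * h * w)
```

In this sketch, a student error at a position where the teacher responds strongly is penalized more than the same error at a weakly activated position, which is one common way to emphasize foreground regions during distillation.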
Pages: 17