MFFSODNet: Multiscale Feature Fusion Small Object Detection Network for UAV Aerial Images

Cited by: 55
Authors
Jiang, Lingjie [1, 2, 3]
Yuan, Baoxi [1, 2, 3]
Du, Jiawei [1]
Chen, Boyu [4]
Xie, Hanfei [1, 2, 3]
Tian, Juan [5]
Yuan, Ziqi [6]
Affiliations
[1] Xijing Univ, Sch Elect Informat, Xian 710123, Peoples R China
[2] Xijing Univ, Xian Key Lab High Precis Ind Intelligent Vis Measu, Xian 710123, Peoples R China
[3] Shaanxi Jiurui Technol Co Ltd, Xian 710065, Shaanxi, Peoples R China
[4] Air Force Engn Univ, Air Traff Control & Ground Control Intercept Coll, Xian 710038, Peoples R China
[5] Xijing Univ, Sch Humanities & Educ, Xian 710123, Peoples R China
[6] Minzu Univ China, Sch Econ, Beijing 100081, Peoples R China
Keywords
Deep learning; feature pyramid network (FPN); multiscale feature extraction; small object detection; unmanned aerial vehicle (UAV) aerial image
DOI
10.1109/TIM.2024.3381272
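Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract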
Unmanned aerial vehicle (UAV) aerial image object detection is a valuable and challenging research field. Despite the breakthroughs of deep learning-based object detection networks in natural scenes, UAV images often exhibit a high proportion of small objects, dense distribution, and significant variation in object scale, all of which make accurate detection difficult. To address these issues, we propose a multiscale feature fusion small object detection network (MFFSODNet). First, because small objects make up a high proportion of UAV images, an additional tiny object prediction head is introduced in place of the large object prediction head. This provides good detection accuracy for small objects and significantly reduces the number of parameters. Second, to enhance the network's ability to extract fine-grained information from small objects, a multiscale feature extraction module (MSFEM) is designed, which extracts rich multiscale feature information through convolution operations at different scales on multiple branches. Third, to fuse the fine-grained information from shallow feature maps with the semantic information from deep feature maps, a new bidirectional dense feature pyramid network (BDFPN) is proposed. By expanding the scale of the feature pyramid network and introducing skip connections, BDFPN achieves efficient multiscale information fusion. Extensive experiments on the VisDrone and UAVDT benchmark datasets demonstrate that MFFSODNet outperforms state-of-the-art object detection methods, and experiments on photovoltaic array defect datasets (PVDs) further validate its effectiveness and generalization.
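The abstract does not give the internals of MSFEM, so the following is only a minimal sketch of the general idea of running parallel branches with different receptive fields over the same input and stacking the results as channels. Everything in it is an assumption made for illustration: the function names `box_conv` and `multiscale_branches`, the kernel sizes 1, 3, and 5, and the use of fixed averaging kernels in place of learned convolution weights. It is not the authors' implementation.

```python
from typing import List, Sequence

Grid = List[List[float]]


def box_conv(image: Grid, k: int) -> Grid:
    """Filter `image` with a k x k averaging kernel (k odd).

    Zero padding keeps the output the same size as the input.
    The averaging kernel is a stand-in for learned weights.
    """
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    # Positions outside the image contribute zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
            out[y][x] = total / (k * k)
    return out


def multiscale_branches(image: Grid, kernel_sizes: Sequence[int] = (1, 3, 5)) -> List[Grid]:
    """Run one branch per kernel size and return the outputs as a channel list.

    The kernel sizes are hypothetical; the abstract only says that
    convolutions of different scales run on multiple branches.
    """
    return [box_conv(image, k) for k in kernel_sizes]


# A single bright pixel stands in for a tiny object in a 5 x 5 image.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
features = multiscale_branches(image)
```

In this toy setup the 1 x 1 branch keeps the pixel sharp, while the wider branches spread its response over a larger neighborhood, which is the sense in which each branch sees the same small object at a different scale.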
Pages: 1-14
Page count: 14