SF-YOLO: A Novel YOLO Framework for Small Object Detection in Aerial Scenes

Cited by: 1
Authors
Sun, Meng [1, 2]
Wang, Le [1, 2, 3]
Jiang, Wangyu [1, 2]
Dharejo, Fayaz Ali [4, 5]
Mao, Guojun [1, 2, 3]
Timofte, Radu [4, 5]
Affiliations
[1] Fujian Univ Technol, Coll Comp, Fujian Prov Key Lab Big Data Min & Applicat, Fuzhou, Peoples R China
[2] Fujian Univ Technol, Sch Comp Sci & Math, Fuzhou, Peoples R China
[3] Fujian Univ Technol, Technol Innovat Ctr Factored Transact Data Tourist, Minist Culture & Tourism, Fuzhou, Peoples R China
[4] Univ Wurzburg, Comp Vis Lab, CAIDAS, Wurzburg, Germany
[5] Univ Wurzburg, IFI, Wurzburg, Germany
Keywords
computer vision; convolutional neural nets; convolution; feature extraction; object detection; MODEL;
DOI
10.1049/ipr2.70027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection models are widely applied in fields such as video surveillance and unmanned aerial vehicles, where they identify and monitor various objects against diverse backgrounds. General CNN-based object detectors rely primarily on downsampling and pooling operations, so they often struggle with small, low-resolution objects and fail to fully exploit the contextual information that can distinguish objects from complex backgrounds. To address these problems, we propose a novel YOLO framework, SF-YOLO, for small object detection. First, we present a spatial information perception (SIP) module that extracts contextual features for different objects by integrating a space-to-depth operation with a large selective kernel module; it dynamically adjusts the receptive field of the backbone and yields enhanced features that better differentiate objects from the background. Second, we design a novel multi-scale feature weighted fusion strategy that fuses feature maps by combining the fast normalized fusion method with the CARAFE operation, accurately assessing the importance of each feature and enhancing the representation of small objects. Extensive experiments on the VisDrone2019, Tiny-Person and PESMOD datasets demonstrate that the proposed method achieves detection performance comparable to that of state-of-the-art detectors.
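Two of the building blocks named in the abstract, the space-to-depth operation and fast normalized fusion, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the nested-list tensor layout, the channel ordering and the default `eps` value are all assumptions made for illustration, and the large selective kernel module and the CARAFE operation are not shown. Fast normalized fusion is written here in its commonly used form, where each non-negative weight is divided by the sum of all weights plus a small constant.

```python
def space_to_depth(x, block=2):
    """Rearrange a C x H x W nested list into (C*block*block) x (H/block) x (W/block).

    Spatial detail is moved into channels instead of being discarded,
    unlike strided convolution or pooling. Channel order is an assumption.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for dy in range(block):          # row offset inside each block
        for dx in range(block):      # column offset inside each block
            for c in range(channels):
                out.append([[x[c][i][j] for j in range(dx, width, block)]
                            for i in range(dy, height, block)])
    return out


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shape 2-D maps as sum(w_i * F_i) / (eps + sum(w_j)).

    Weights are clipped at zero (a ReLU) so the normalized weights
    stay in the range [0, 1]. In a network they would be learned.
    """
    w = [max(0.0, wi) for wi in weights]
    total = eps + sum(w)
    rows, cols = len(features[0]), len(features[0][0])
    return [[sum(wi * f[r][c] for wi, f in zip(w, features)) / total
             for c in range(cols)] for r in range(rows)]


# Hypothetical usage: one 4x4 channel becomes four 2x2 channels.
image = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
packed = space_to_depth(image, block=2)

# Hypothetical usage: fuse two 1x2 maps with weights 1 and 3.
fused = fast_normalized_fusion([[[1.0, 2.0]], [[3.0, 4.0]]], [1.0, 3.0], eps=0.0)
```

In this sketch every input value survives the space-to-depth step, which is the property that makes the operation attractive for small objects, and the fusion weights play the role of per-feature importance scores.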
Pages: 14
Related papers
60 entries in total
[1]   YOLOv4-5D: An Effective and Efficient Object Detector for Autonomous Driving [J].
Cai, Yingfeng ;
Luan, Tianyu ;
Gao, Hongbo ;
Wang, Hai ;
Chen, Long ;
Li, Yicheng ;
Sotelo, Miguel Angel ;
Li, Zhixiong .
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
[2]  
Chen YT, 2019, CHIN CONT DECIS CONF, P4610, DOI [10.1109/ccdc.2019.8832735, 10.1109/CCDC.2019.8832735]
[3]   Context Refinement for Object Detection [J].
Chen, Zhe ;
Huang, Shaoli ;
Tao, Dacheng .
COMPUTER VISION - ECCV 2018, PT VIII, 2018, 11212 :74-89
[4]  
Delibasoglu I, 2021, Arxiv, DOI arXiv:2103.11460
[5]   Extended Feature Pyramid Network for Small Object Detection [J].
Deng, Chunfang ;
Wang, Mengmeng ;
Liu, Liang ;
Liu, Yong ;
Jiang, Yunliang .
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 :1968-1979
[6]   A Small-Ship Object Detection Method for Satellite Remote Sensing Data [J].
Fan, Xiyu ;
Hu, Zhuhua ;
Zhao, Yaochi ;
Chen, Junfei ;
Wei, Tianjiao ;
Huang, Zixun .
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 :11886-11898
[7]  
Howard AG, 2017, Arxiv, DOI [arXiv:1704.04861, 10.48550/arXiv.1704.04861]
[8]  
Gao N, 2021, CHINA COMMUN, V18, P253, DOI 10.23919/JCC.2021.07.020
[9]   NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection [J].
Ghiasi, Golnaz ;
Lin, Tsung-Yi ;
Le, Quoc V. .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :7029-7038
[10]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448