SF-YOLO: A Novel YOLO Framework for Small Object Detection in Aerial Scenes

Cited by: 1
Authors
Sun, Meng [1, 2]
Wang, Le [1, 2, 3]
Jiang, Wangyu [1, 2]
Dharejo, Fayaz Ali [4, 5]
Mao, Guojun [1, 2, 3]
Timofte, Radu [4, 5]
Affiliations
[1] Fujian Univ Technol, Coll Comp, Fujian Prov Key Lab Big Data Min & Applicat, Fuzhou, Peoples R China
[2] Fujian Univ Technol, Sch Comp Sci & Math, Fuzhou, Peoples R China
[3] Fujian Univ Technol, Technol Innovat Ctr Factored Transact Data Tourist, Minist Culture & Tourism, Fuzhou, Peoples R China
[4] Univ Wurzburg, Comp Vis Lab, CAIDAS, Wurzburg, Germany
[5] Univ Wurzburg, IFI, Wurzburg, Germany
Keywords
computer vision; convolutional neural nets; convolution; feature extraction; object detection; MODEL
DOI
10.1049/ipr2.70027
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection models are widely applied in fields such as video surveillance and unmanned aerial vehicles to identify and monitor various objects against diverse backgrounds. General CNN-based object detectors rely primarily on downsampling and pooling operations; they often struggle with small, low-resolution objects and fail to fully exploit the contextual information that can distinguish objects from complex backgrounds. To address these problems, we propose a novel YOLO framework called SF-YOLO for small object detection. First, we present a spatial information perception (SIP) module that extracts contextual features for different objects by integrating a space-to-depth operation with a large selective kernel module. The module dynamically adjusts the receptive field of the backbone and yields enhanced features that better differentiate objects from the background. Furthermore, we design a novel multi-scale feature weighted fusion strategy that performs weighted fusion on feature maps by combining the fast normalized fusion method with the CARAFE operation, accurately assessing the importance of each feature and enhancing the representation of small objects. Extensive experiments on the VisDrone2019, Tiny-Person and PESMOD datasets demonstrate that the proposed method achieves detection performance comparable to state-of-the-art detectors.
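Two of the building blocks named in the abstract can be illustrated in a few lines. The sketch below is not the authors' implementation: it assumes the usual definition of space-to-depth (each block x block patch is moved into separate channels, so no pixel is discarded) and the fast normalized fusion formula introduced with BiFPN in EfficientDet (non-negative weights divided by their sum plus a small epsilon). The function names, the nested-list representation, and the default values are illustrative assumptions; the large selective kernel module and CARAFE upsampling are omitted.

```python
from typing import List, Sequence

Grid = List[List[float]]  # a single feature channel of shape H x W


def space_to_depth(channel: Grid, block: int = 2) -> List[Grid]:
    """Split one H x W channel into block*block channels of size (H/block) x (W/block).

    Every input value is kept; resolution is traded for channel depth.
    """
    h, w = len(channel), len(channel[0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    return [
        [[channel[i][j] for j in range(dx, w, block)] for i in range(dy, h, block)]
        for dy in range(block)
        for dx in range(block)
    ]


def fast_normalized_fusion(
    features: Sequence[Grid], weights: Sequence[float], eps: float = 1e-4
) -> Grid:
    """Fuse same-sized feature maps: sum(w_i * F_i) / (eps + sum(w_j)), with w_i >= 0."""
    clipped = [max(0.0, w) for w in weights]  # ReLU keeps the weights non-negative
    total = sum(clipped) + eps
    h, w = len(features[0]), len(features[0][0])
    return [
        [sum(wi * f[i][j] for wi, f in zip(clipped, features)) / total for j in range(w)]
        for i in range(h)
    ]
```

In a trained detector the fusion weights would be learned parameters and the operations would run on tensors; the nested lists here only make the index arithmetic explicit.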
Pages: 14