SF-YOLO: A Novel YOLO Framework for Small Object Detection in Aerial Scenes

Times cited: 0
Authors
Sun, Meng [1, 2]
Wang, Le [1, 2, 3]
Jiang, Wangyu [1, 2]
Dharejo, Fayaz Ali [4, 5]
Mao, Guojun [1, 2, 3]
Timofte, Radu [4, 5]
Affiliations
[1] Fujian Univ Technol, Coll Comp, Fujian Prov Key Lab Big Data Min & Applicat, Fuzhou, Peoples R China
[2] Fujian Univ Technol, Sch Comp Sci & Math, Fuzhou, Peoples R China
[3] Fujian Univ Technol, Technol Innovat Ctr Factored Transact Data Tourist, Minist Culture & Tourism, Fuzhou, Peoples R China
[4] Univ Wurzburg, Comp Vis Lab, CAIDAS, Wurzburg, Germany
[5] Univ Wurzburg, IFI, Wurzburg, Germany
Keywords
computer vision; convolutional neural nets; convolution; feature extraction; object detection; MODEL;
DOI
10.1049/ipr2.70027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection models are widely applied in fields such as video surveillance and unmanned aerial vehicles to identify and monitor various objects against diverse backgrounds. General CNN-based object detectors rely primarily on downsampling and pooling operations, so they often struggle with small, low-resolution objects and fail to fully exploit the contextual information that can differentiate objects from complex backgrounds. To address these problems, we propose a novel YOLO framework called SF-YOLO for small object detection. First, we present a spatial information perception (SIP) module that extracts contextual features for different objects by integrating a space-to-depth operation with a large selective kernel module. The module dynamically adjusts the receptive field of the backbone and produces enhanced features that better distinguish objects from the background. Second, we design a novel multi-scale feature weighted fusion strategy that performs weighted fusion on feature maps by combining the fast normalized fusion method with the CARAFE operation, accurately assessing the importance of each feature and enhancing the representation of small objects. Extensive experiments on the VisDrone2019, Tiny-Person and PESMOD datasets demonstrate that the proposed method achieves detection performance comparable to state-of-the-art detectors.
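Two of the building blocks named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the (C, H, W) array layout, the block size of 2 and the `eps` value are all assumptions made for illustration. The fusion formula follows the commonly used fast normalized fusion form, in which non-negative weights are divided by their sum plus a small constant. The large selective kernel module and CARAFE are not shown.

```python
import numpy as np


def space_to_depth(x, block=2):
    """Rearrange a (C, H, W) map into (C*block*block, H/block, W/block).

    Unlike strided convolution or pooling, no pixel is discarded:
    spatial detail is moved into the channel dimension.
    """
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0, "H and W must divide by block"
    # Split each spatial axis into (coarse position, offset inside block).
    x = x.reshape(c, h // block, block, w // block, block)
    # Bring the two in-block offsets to the front, then merge with channels.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(c * block * block, h // block, w // block)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with normalized non-negative weights.

    Weights are clipped at zero (ReLU) and divided by their sum plus eps,
    which avoids the exponentials of a softmax. In a trained network the
    weights would be learned; here they are passed in by hand.
    """
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))


# Hypothetical usage on toy data.
x = np.arange(16, dtype=np.float64).reshape(1, 4, 4)
y = space_to_depth(x)            # shape (4, 2, 2)
fused = fast_normalized_fusion([y, np.zeros_like(y)], weights=[1.0, 1.0])
```

In this sketch the space-to-depth step halves the spatial resolution while quadrupling the channel count, so every input value is still present in the output, which is the property that motivates its use for small objects.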
Pages: 14