ATBHC-YOLO: aggregate transformer and bidirectional hybrid convolution for small object detection

Cited by: 3
Authors
Liao, Dandan [1]
Zhang, Jianxun [1]
Tao, Ye [1]
Jin, Xie [2]
Affiliations
[1] Chongqing Univ Technol, Dept Comp Sci & Engn, Chongqing 400054, Peoples R China
[2] Northern Univ Malaysia, Sintok, Malaysia
Keywords
Small object detection; Attention mechanism; ATBHC-YOLO; Cross-feature fusion; IMAGES
DOI
10.1007/s40747-024-01652-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection in UAV images is a current research focus in computer vision, with frequent advances in recent years. However, many methods are ineffective on challenging UAV images that feature uneven object scales, sparse spatial distribution, and dense occlusion. We propose a new algorithm for detecting small objects in UAV images, called ATBHC-YOLO. First, the MS-CET module is introduced to strengthen the model's focus on global sparse features in the spatial distribution of small objects. Second, the BHC-FB module is proposed to address the large scale variance of small objects and to enhance the perception of local features. Finally, a more appropriate loss function, WIoU, is used to penalise the quality variance of small object samples and further improve the model's detection accuracy. Comparison experiments on the DIOR and VEDAI datasets validate the effectiveness and robustness of the improved method. In experiments on the publicly available UAV benchmark dataset VisDrone, ATBHC-YOLO outperforms the state-of-the-art method (YOLOv7) by 3.5%.
Pages: 15