Small Target Detection Model in Aerial Images Based on TCA-YOLOv5m

Cited by: 8
Authors
Huang, Min [1, 2]
Zhang, Yiyan [2]
Chen, Yazhou [1]
Affiliations
[1] Army Engn Univ, Natl Key Lab Electromagnet Environm Effects, Shijiazhuang Campus, Shijiazhuang 050003, Peoples R China
[2] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China
Keywords
Feature extraction; Object detection; Classification algorithms; Proposals; Prediction algorithms; Transformers; Deep learning; Aerial images; small target detection; TCA-YOLOv5m; transformer algorithm; coordinate attention; path aggregation network; LANGUAGE; TRENDS;
DOI
10.1109/ACCESS.2022.3232293
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Target detection in aerial images taken by unmanned aerial vehicles (UAVs) is at present the most widely used application scenario. Compared with ordinary images, aerial images have more complex backgrounds and smaller targets, which lowers detection precision and raises the false detection rate. This paper proposes a new small target detection model, TCA-YOLOv5m, which is based on YOLOv5m and combines the Transformer algorithm with the Coordinate Attention (CA) mechanism. In this model, the transformer algorithm is added to the end of the YOLOv5 backbone, enabling the model to mine more feature information from images. In the neck layer of TCA-YOLOv5m, the Path Aggregation Network (PANet) is combined with the transformer algorithm to enhance the expressive capacity of the feature pyramid and improve detection precision for occluded, high-density small targets, and CA is introduced to locate targets more accurately in high-density scenes. In addition, TCA-YOLOv5m adds a detection layer to improve its ability to capture small targets. Experiments on the VisDrone 2019 dataset compare the detection precision and detection speed of the proposed model with those of baseline models. The experimental results indicate that the detection precision of TCA-YOLOv5m reaches 97.4%, which is 5.2% higher than that of YOLOv5; mAP@50 reaches 58.5%, which is 14.8% higher than that of YOLOv5. TCA-YOLOv5m runs at 12.96 frames per second (FPS), which provides a degree of real-time performance. Therefore, TCA-YOLOv5m is suitable for detecting dense small targets in aerial images.
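As a rough reading of the reported speed, a frame rate can be converted into an average processing time per frame. The sketch below is illustrative arithmetic only: the function name `frame_latency_ms` is hypothetical and does not come from the paper, and it assumes the reported 12.96 f/s is an average throughput.

```python
def frame_latency_ms(fps: float) -> float:
    """Return the average time per frame, in milliseconds, for a frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# FPS value reported in the abstract (assumed here to be an average).
latency = frame_latency_ms(12.96)
print(f"{latency:.2f} ms per frame")  # prints "77.16 ms per frame"
```

Under that assumption, each frame takes roughly 77 ms on average, which is consistent with the abstract's cautious wording about real-time performance.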
Pages: 3352-3366
Page count: 15