Research on infrared small target pedestrian and vehicle detection algorithm based on multi-scale feature fusion

Cited by: 1
Authors
Xiang, Xinjian [1]
Zhang, Guolong [1]
Huang, Li [2]
Zheng, Yongping [1]
Xie, Zongyi [1]
Sun, Siqi [1]
Yuan, Tianshun [1]
Chen, Xizhao [1]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Automat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Zhejiang Safun Ind Co Ltd, Jinhua 321300, Peoples R China
Keywords
Pedestrian and vehicle detection; Infrared small target detection; Small target detection layer; Double layer routing attention; Model lightweight; VISION;
DOI
10.1007/s11554-024-01607-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Infrared imaging forms an image by detecting the electromagnetic waves that an object emits through spontaneous thermal radiation, so it can overcome the adverse effects of complex lighting conditions on the detection of pedestrians and vehicles on the road. To address the low accuracy and missed detections of visual detection under complex traffic conditions, such as rain, snow, or night, a pedestrian and vehicle detection model using infrared imaging is proposed. The model improves the neck network and incorporates an attention mechanism. First, a multi-scale feature fusion small-object detection layer is added to the model's neck, which enhances the capture of detailed information about small infrared objects and reduces missed detections. Second, a novel dual-layer routing attention mechanism is designed, which allows the model to focus on the most relevant feature areas and improves the detection accuracy of small infrared objects. Next, the CARAFE upsampling method is used for adaptive upsampling and context information fusion, which enhances the model's ability to reorganize features and capture details. Finally, a lightweight CSPPC module is constructed using partial convolutions to replace the C2f module in the neck network, which improves the model's frame rate. Experimental results show that, compared to the baseline model, the proposed BCC-YOLOv8n improves precision, recall, mAP@0.5, and mAP@0.5:0.95 by 1.4%, 4.8%, 5.3%, and 4.5%, respectively, while reducing the number of parameters by approximately 7%. It also achieves a frame rate of 70.8 FPS, which satisfies the requirements for real-time detection.
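The abstract attributes the higher frame rate to partial convolutions. A rough sketch of why a partial convolution is cheaper than a standard one is given below. It is not the paper's CSPPC module, whose structure the abstract does not describe. The channel count (64), kernel size (3), and partial ratio (1/4) are illustrative assumptions, and the helper names `conv_params` and `pconv_params` are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def pconv_params(channels: int, k: int, ratio: float = 0.25) -> int:
    """Weight count of a partial convolution.

    Only the first `ratio` share of the channels is convolved; the
    remaining channels are passed through unchanged and cost no weights.
    The default ratio of 1/4 is an assumption for illustration.
    """
    c_p = int(channels * ratio)  # number of channels that are convolved
    return k * k * c_p * c_p


# Illustrative layer size, not taken from the paper.
channels, k = 64, 3
full = conv_params(channels, channels, k)
partial = pconv_params(channels, k)
print(full, partial, partial / full)
```

With these assumed sizes, the partial convolution needs 1/16 of the weights of the standard convolution at that layer. The roughly 7% overall parameter reduction reported in the abstract is much smaller because only part of the network is replaced.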
Pages: 16