ScanGuard-YOLO: Enhancing X-ray Prohibited Item Detection with Significant Performance Gains

Cited by: 6
Authors
Huang, Xianning [1]
Zhang, Yaping [1]
Affiliations
[1] Yunnan Normal Univ, Sch Informat Sci & Technol, Kunming 650500, Peoples R China
Keywords
X-ray image; prohibited items detection; deep learning; YOLOv5; multiscale feature fusion
DOI
10.3390/s24010102
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
To address the low recall rate in detecting prohibited items in X-ray images, caused by severe object occlusion and complex backgrounds, an X-ray prohibited item detection network, ScanGuard-YOLO, based on the YOLOv5 architecture, is proposed to improve the model's recall rate and the comprehensive F1 score. Firstly, the RFB-s module was added to the end of the backbone, using dilated convolution to enlarge the receptive field of the backbone network and better capture global features. In the neck, the efficient RepGFPN module was employed to fuse multiscale information from the backbone output, capturing details and contextual information at various scales and thereby enhancing the model's understanding and representation of objects. Secondly, a novel detection head was introduced to unify scale-awareness, spatial-awareness, and task-awareness, which significantly improved the representation ability of the object detection head. Finally, the bounding box regression loss was defined as the WIOUv3 loss, balancing the contribution of low-quality and high-quality samples to the loss. ScanGuard-YOLO was tested on the OPIXray and HiXray datasets and showed improvements over the baseline model on both: the mean average precision (mAP@0.5) increased by 2.3% and 1.6%, the recall rate by 4.5% and 2%, and the F1 score by 2.3% and 1%, respectively. The experimental results demonstrate that ScanGuard-YOLO enhances the detection of prohibited items against complex backgrounds and has broad prospects for application.
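The abstract states that dilated convolution enlarges the receptive field of the backbone. As a rough illustration of why, the sketch below computes the span covered by a single dilated kernel and the receptive field of a stack of stride-1 convolutions. The function names and the example dilation rates are hypothetical and chosen only for illustration; they are not taken from the RFB-s configuration used in the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input pixels) covered by one dilated kernel along one axis.

    A dilation of d inserts d - 1 gaps between adjacent kernel taps.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stride-1 convolutions applied in sequence.

    layers: iterable of (kernel_size, dilation) pairs.
    Assumes every layer has stride 1 (an assumption of this sketch).
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


if __name__ == "__main__":
    # Illustrative dilation rates only; not the paper's settings.
    for d in (1, 3, 5):
        print(f"3x3 kernel, dilation {d}: spans {effective_kernel_size(3, d)} pixels")
    print("3x3 (d=1) followed by 3x3 (d=3):",
          stacked_receptive_field([(3, 1), (3, 3)]), "pixels")
```

The parameter count of a 3x3 kernel stays at nine weights regardless of dilation, so the wider span comes without extra weights, which is the usual motivation for placing dilated branches at the end of a backbone.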
Pages: 22