Swin-YOLO for Concealed Object Detection in Millimeter Wave Images

Cited by: 10
Authors
Huang, Pingping [1, 2]
Wei, Ran [1, 2]
Su, Yun [1, 2]
Tan, Weixian [1, 2]
Affiliations
[1] Inner Mongolia Univ Technol, Coll Informat Engn, Hohhot 010051, Peoples R China
[2] Inner Mongolia Key Lab Radar Technol & Applicat, Hohhot 010051, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 17
Keywords
millimeter wave images; concealed object detection; Swin Transformer; attention mechanism; segmentation
DOI
10.3390/app13179793
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Concealed object detection in millimeter wave (MMW) images has gained significant attention in public safety, primarily because MMW imaging is non-hazardous and non-contact. In practice, however, the task faces substantial challenges: low imaging resolution, small concealed objects, intricate environmental noise, and the need for real-time performance. In this study, we propose Swin-YOLO, a single-stage detection model built upon transformer layers. Our approach makes three key contributions. First, we integrate Local Perception Swin Transformer Layers (LPST Layers), which enhance the network's ability to acquire contextual information and local awareness. Second, we introduce a novel feature fusion layer and a specialized prediction head for detecting small targets, which make effective use of the network's shallow feature information. Third, we incorporate a coordinate attention (CA) module between the neck network and the detection head, which increases the network's sensitivity to critical regions of small objects. To validate the efficacy and feasibility of the proposed method, we created a new MMW dataset containing a large number of small concealed objects and conducted comprehensive experiments to evaluate the overall and partial improvements, as well as computational efficiency. The results show a 4.7% improvement in mean Average Precision (mAP) for Swin-YOLO compared with the YOLOv5 baseline. Moreover, compared with other enhanced transformer-based models, Swin-YOLO achieved superior accuracy and the fastest inference speed. The proposed model holds promise for real-world applications in public safety.
Pages: 23