Helmet Wearing Detection of Motorcycle Drivers Using Deep Learning Network with Residual Transformer-Spatial Attention

Cited by: 11
Authors
Chen, Shuai [1, 2]
Lan, Jinhui [1, 2]
Liu, Haoting [1, 2]
Chen, Chengkai [1, 2]
Wang, Xiaohan [1]
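Affiliations
[1] University of Science and Technology Beijing, Shunde Innovation School, Foshan 528399, People's Republic of China
[2] University of Science and Technology Beijing, School of Automation and Electrical Engineering, Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, People's Republic of China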
Funding
National Natural Science Foundation of China
Keywords
helmet wearing detection; LMNet; UAV; residual transformer-spatial attention; super-resolution reconstruction; small-object detection; algorithm; YOLO
DOI
10.3390/drones6120415
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Subject classification codes
081102; 0816; 081602; 083002; 1404
Abstract
To address the problems of detecting riders' helmet wearing in unmanned aerial vehicle (UAV) aerial photography, a novel aerial remote sensing detection paradigm is proposed that combines super-resolution reconstruction, residual transformer-spatial attention, and a You Only Look Once version 5 (YOLOv5) image classifier. Because targets in UAV aerial images are small, vary significantly in size, and suffer from strong motion blur, existing helmet detection models for riders have weak generalization ability and low accuracy. First, a ladder-type multi-attention network (LMNet) for target detection is designed to overcome these difficulties. LMNet enables information interaction and fusion at each stage, fully extracts image features, and minimizes information loss. Second, the Residual Transformer 3D-spatial Attention Module (RT3DsAM) is proposed, which captures the global information that is important for feature representation and for the final classification and detection. It also builds self-attention and strengthens the correlation between pieces of information. Third, the rider images detected by LMNet are cropped and reconstructed by the enhanced super-resolution generative adversarial network (ESRGAN) to restore more realistic texture information and sharper edges. Finally, the reconstructed rider images are classified by the YOLOv5 classifier. Experimental results show that, compared with existing methods, the proposed method improves the detection accuracy of riders' helmets in aerial photography scenes, with the target detection mean average precision (mAP) reaching 91.67% and the image classification top-1 accuracy (TOP1 ACC) reaching 94.23%.
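The four stages named in the abstract (detect riders, crop, super-resolve, classify) can be sketched as a simple data flow. This is only an illustrative skeleton, not the authors' implementation: the names `crop`, `upscale_nearest`, `run_pipeline`, `toy_detect`, and `toy_classify` are hypothetical, nearest-neighbour upscaling stands in for ESRGAN, and the toy detector and classifier stand in for LMNet and the YOLOv5 classifier.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def upscale_nearest(image: Image, factor: int = 4) -> Image:
    """Stand-in for ESRGAN (assumption): plain nearest-neighbour upscaling."""
    out: Image = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_pipeline(
    image: Image,
    detect: Callable[[Image], List[Box]],
    super_resolve: Callable[[Image], Image],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Detect riders, crop each one, super-resolve the crop, then classify it."""
    results = []
    for box in detect(image):
        patch = super_resolve(crop(image, box))
        results.append((box, classify(patch)))
    return results


# Toy stand-ins for LMNet and the YOLOv5 classifier (hypothetical).
def toy_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 2)]


def toy_classify(patch: Image) -> str:
    pixels = [px for row in patch for px in row]
    return "helmet" if sum(pixels) / len(pixels) > 127 else "no_helmet"


frame = [
    [200, 200, 0, 0],
    [200, 200, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(run_pipeline(frame, toy_detect, upscale_nearest, toy_classify))
```

Passing the stages in as callables keeps the flow independent of any particular detector, super-resolution model, or classifier, so real models could be substituted without changing `run_pipeline`.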
Pages: 26