A cascaded deep-learning-based model for face mask detection

Cited by: 6

Authors:

Kumar, Akhil [1]

Affiliations:

[1] Himachal Pradesh Univ, Dept Comp Sci, Shimla, India

Source:

DATA TECHNOLOGIES AND APPLICATIONS | 2023 / Vol. 57 / No. 01

Keywords:

Face mask detection; ResNet34; YOLO; SPP layer; Deep learning; RECOGNITION;

DOI:

10.1108/DTA-02-2022-0076

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Purpose - This work presents a deep learning model for face mask detection in surveillance environments such as automatic teller machines (ATMs) and banks, to identify persons wearing face masks. In such environments, complete visibility of the face area is a guideline, and criminals and law offenders commit crimes while hiding their faces behind a face mask. The proposed face mask detector can be integrated with surveillance cameras in autonomous surveillance environments as a tool to identify and catch law offenders and criminals.

Design/methodology/approach - The proposed face mask detector is developed by integrating the residual network (ResNet)34 feature extractor on top of three You Only Look Once (YOLO) detection layers, together with a spatial pyramid pooling (SPP) layer to extract a rich and dense feature map. Furthermore, at training time, data augmentation operations such as Mosaic and MixUp are applied to the feature extraction network so that it is trained with images of varying complexities. The proposed detector is trained and tested on a custom face mask detection dataset consisting of 52,635 images. For validation, its performance is compared with that of YOLO v1, v2, tiny YOLO v1, v2, v3 and v4 and other benchmark work in the literature, using precision, recall, F1 score, mean average precision (mAP) for the overall dataset and average precision (AP) for each class of the dataset.

Findings - The proposed face mask detector achieved 4.75-9.75 per cent higher detection accuracy in terms of mAP, 5-31 per cent higher AP for detection of faces with masks and, specifically, 2-30 per cent higher AP for detection of face masks on the face region, as compared with the tested baseline variants of YOLO. Furthermore, the use of the ResNet34 feature extractor and SPP layer in the proposed detection model reduced the training time and the detection time. The proposed model can perform detection on an image in 0.45 s, which is 0.15-0.2 s less than for the other tested YOLO variants, so it performs detections at a higher speed.

Research limitations/implications - The proposed face mask detector model can be used as a tool to detect persons with face masks who are a potential threat to automatic surveillance environments such as ATMs, banks, airport security checks, etc. A further research implication is that the model can be trained and tested for other object detection problems such as cancer detection in images, fish species detection, vehicle detection, etc.

Practical implications - The proposed face mask detector can be integrated with automatic surveillance systems and used as a tool to detect persons with face masks who are potential threats to ATMs, banks, etc. and, in the present times of COVID-19, to detect whether people are following the COVID-appropriate behavior of wearing a face mask in public areas.

Originality/value - The novelty of this work lies in the use of the ResNet34 feature extractor with YOLO detection layers, which makes the proposed model a compact and powerful convolutional neural-network-based face mask detector. Furthermore, the SPP layer has been applied to the ResNet34 feature extractor to enable it to extract a rich and dense feature map. The other novelty of the present work is the implementation of Mosaic and MixUp data augmentation in the training network, which provided the feature extractor with 3x images of varying complexities and orientations and further aided in achieving higher detection accuracy. The proposed model is novel in terms of extracting rich features, performing augmentation at training time and achieving high detection accuracy while maintaining the detection speed.
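The evaluation metrics named in the abstract (precision, recall, F1 score and mAP) can be related to one another with a minimal sketch. This is only an illustration of the standard definitions, not the paper's evaluation code; the function names, class names, counts and AP values below are hypothetical and do not come from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class average precision (AP) values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical detection counts and per-class AP values, for illustration only.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
m_ap = mean_average_precision({"face_with_mask": 0.9, "face_mask_region": 0.7})
```

Because mAP averages the per-class AP values, a gain in AP for one class raises mAP by only a fraction of that gain, which is consistent with the abstract reporting per-class AP improvements separately from the overall mAP improvement.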


Pages: 84-107

Page count: 24

