Text-Guided Synthesis of Masked Face Images

Cited by: 0
Authors
Anjali, T. [1]
Masilamani, V. [1]
Affiliation
[1] Indian Institute of Information Technology Design and Manufacturing Kancheepuram, Chennai
Keywords
Diffusion models; masked face dataset
DOI
10.1145/3654667
Abstract
The COVID-19 pandemic demonstrated that wearing a face mask helps limit the spread of respiratory viruses. Face authentication systems, which are trained on facial key points such as the eyes, nose, and mouth, struggle to identify a person when most of the face is covered by a mask, and removing the mask for authentication risks spreading infection. Possible solutions are to (a) train face recognition systems to identify a person from upper-face features, (b) reconstruct the complete face with a generative model, or (c) train the model on a dataset of masked faces. In this article, we explore the scope of generative models for image synthesis. We used Stable Diffusion to generate masked face images of popular celebrities from various text prompts. The resulting dataset, the Realistic Synthetic Masked Face Dataset (RSMFD), contains 15K realistic masked face images of 100 celebrities. The model and the generated dataset will be made public so that researchers can augment the dataset. To the best of our knowledge, this is the largest masked face recognition dataset with realistic images. The generated images were tested on popular deep face recognition models and achieved significant results. Several well-known image classification models were also trained and tested on the dataset, with competitive results. The dataset is available at https://drive.google.com/drive/folders/1yetcgUOL1TOP4rod1geGsOkIrIJHtcEw?usp=sharing. © 2024 Copyright held by the owner/author(s).
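The abstract does not list the text prompts used, so the following is only a minimal sketch of how a prompt set for text-guided generation could be assembled. The identity placeholders, mask types, colors, settings, and the `build_prompts` helper are all hypothetical and are not taken from the paper. The only figures from the source are the 15K images and 100 celebrities, which would work out to 150 images per identity if the images were spread evenly (an assumption).

```python
from itertools import product

# Hypothetical prompt vocabulary: the source does not state its prompts.
IDENTITIES = ["celebrity_001", "celebrity_002"]  # placeholder identity names
MASK_TYPES = ["surgical mask", "N95 mask", "cloth mask"]
MASK_COLORS = ["white", "blue", "black"]
SETTINGS = ["outdoors", "in an office"]


def build_prompts(identities, mask_types, mask_colors, settings):
    """Return one (identity, prompt) pair per attribute combination."""
    return [
        (name, f"a photo of {name} wearing a {color} {mask}, {setting}")
        for name, mask, color, setting in product(
            identities, mask_types, mask_colors, settings
        )
    ]


prompts = build_prompts(IDENTITIES, MASK_TYPES, MASK_COLORS, SETTINGS)

# Figures from the abstract: 15K images over 100 celebrities.
# An even split across identities is an assumption, not a stated fact.
images_per_identity = 15_000 // 100

if __name__ == "__main__":
    print(len(prompts), "prompts;", images_per_identity, "images per identity")
```

Each prompt string could then be passed to a text-to-image model such as Stable Diffusion; the generation step is omitted here because the abstract gives no model version or sampling settings.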