Improved Deepfake Video Detection Using Convolutional Vision Transformer

Cited by: 2

Authors:

Deressa, Deressa Wodajo

Lambert, Peter [1]

Van Wallendael, Glenn [1]

Atnafu, Solomon [2]

Mareen, Hannes [1]

Affiliations:

[1] Univ Ghent, IMEC, IDLab, Dept Elect & Informat Syst, Ghent, Belgium

[2] Addis Ababa Univ, Addis Ababa, Ethiopia

Source:

2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024 | 2024

Keywords:

Deepfake Video Detection; Vision Transformer; Convolutional Neural Network; Misinformation Detection; Multimedia Forensics;

DOI:

10.1109/GEM61861.2024.10585593

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203; 0835;

Abstract:

Deepfakes are hyper-realistic videos in which faces are replaced, swapped, or forged using deep-learning models. These potent media manipulation techniques hold promise for applications across various domains. Yet they also present a significant risk when employed for malicious purposes such as identity fraud, phishing, spreading false information, and executing scams. In this work, we propose a novel and improved Deepfake video detector that uses a Convolutional Vision Transformer (CViT2), which builds on the concepts of our previous work (CViT). The CViT architecture consists of two components: a Convolutional Neural Network that extracts learnable features, and a Vision Transformer that categorizes these learned features using an attention mechanism. We trained and evaluated our model on five datasets, namely the Deepfake Detection Challenge Dataset (DFDC), FaceForensics++ (FF++), Celeb-DF v2, DeepfakeTIMIT, and TrustedMedia. On test sets unseen during training, we achieved accuracies of 95%, 94.8%, 98.3%, and 76.7% on the DFDC, FF++, Celeb-DF v2, and DeepfakeTIMIT datasets, respectively. In conclusion, our proposed Deepfake detector can be used in the battle against misinformation and in other forensic use cases.


Pages: 492-497

Number of pages: 6
