Improved Deepfake Video Detection Using Convolutional Vision Transformer

Cited by: 2
Authors
Deressa, Deressa Wodajo
Lambert, Peter [1]
Van Wallendael, Glenn [1]
Atnafu, Solomon [2]
Mareen, Hannes [1]
Affiliations
[1] Univ Ghent, IMEC, IDLab, Dept Elect & Informat Syst, Ghent, Belgium
[2] Addis Ababa Univ, Addis Ababa, Ethiopia
Source
2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024 | 2024
Keywords
Deepfake Video Detection; Vision Transformer; Convolutional Neural Network; Misinformation Detection; Multimedia Forensics;
DOI
10.1109/GEM61861.2024.10585593
Chinese Library Classification number
TP39 [Computer Applications];
Discipline classification code
081203; 0835;
Abstract
Deepfakes are hyper-realistic videos in which faces are replaced, swapped, or forged using deep-learning models. These potent media manipulation techniques hold promise for applications across various domains. Yet they also present a significant risk when employed for malicious purposes such as identity fraud, phishing, spreading false information, and executing scams. In this work, we propose a novel and improved Deepfake video detector that uses a Convolutional Vision Transformer (CViT2), which builds on the concepts of our previous work (CViT). The CViT architecture consists of two components: a Convolutional Neural Network that extracts learnable features, and a Vision Transformer that categorizes these learned features using an attention mechanism. We trained and evaluated our model on 5 datasets, namely the Deepfake Detection Challenge Dataset (DFDC), FaceForensics++ (FF++), Celeb-DF v2, DeepfakeTIMIT, and TrustedMedia. On test sets unseen during training, we achieved an accuracy of 95%, 94.8%, 98.3%, and 76.7% on the DFDC, FF++, Celeb-DF v2, and TIMIT datasets, respectively. In conclusion, our proposed Deepfake detector can be used in the battle against misinformation and in other forensic use cases.
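The two-component design described in the abstract (convolutional feature extraction followed by attention-based classification) can be illustrated with a toy sketch in plain Python. This is not the authors' CViT2 implementation: the function names, the single attention head with identity query/key/value projections, the mean pooling, the sigmoid output, and the random untrained weights are all assumptions made only to show how data might flow through such a pipeline.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image (no padding, stride 1), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), to keep the sketch short.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out


def classify(image, kernels, head_weights, head_bias):
    """Return a score in [0, 1], read here as a hypothetical 'fake' probability."""
    # Stage 1: convolutional feature extraction, one feature map per kernel.
    maps = [conv2d_valid(image, k) for k in kernels]
    # Each spatial position becomes one token with one channel per kernel.
    h, w = len(maps[0]), len(maps[0][0])
    tokens = [[m[i][j] for m in maps] for i in range(h) for j in range(w)]
    # Stage 2: attention over the tokens, mean pooling, then a linear head.
    attended = self_attention(tokens)
    pooled = [sum(t[c] for t in attended) / len(attended)
              for c in range(len(kernels))]
    logit = sum(wt * p for wt, p in zip(head_weights, pooled)) + head_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Usage with random, untrained weights on a 6x6 stand-in for a face crop.
random.seed(0)
image = [[random.random() for _ in range(6)] for _ in range(6)]
kernels = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]
head_weights = [random.uniform(-1, 1) for _ in range(2)]
prob = classify(image, kernels, head_weights, head_bias=0.0)
print(prob)
```

Because the weights are random, the printed score carries no meaning; a real detector would learn the kernels, the attention projections, and the classification head from labeled video frames.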
Pages: 492-497
Page count: 6