Exposing Deepfake Videos with Spatial, Frequency and Multi-scale Temporal Artifacts

Cited by: 2
Authors
Hu, Yongjian [1 ]
Zhao, Hongjie [1 ]
Yu, Zeqiong [1 ]
Liu, Beibei [1 ]
Yu, Xiangyu [1 ]
Affiliations
[1] South China Univ Technol, Guangzhou, Peoples R China
Source
DIGITAL FORENSICS AND WATERMARKING, IWDW 2021 | 2022 / Vol. 13180
Keywords
Deepfake video detection; Multi-domain features; Multi-scale temporal features; Cross-dataset performance;
DOI
10.1007/978-3-030-95398-0_4
Chinese Library Classification
TP39 [Applications of Computers];
Discipline Codes
081203; 0835;
Abstract
The deepfake technique replaces the face in a source video with a fake face generated by deep learning tools such as generative adversarial networks (GANs). Even the facial expression can be well synchronized, making it difficult to identify the fake videos. Using features from multiple domains has proved effective in the literature. It is also known that temporal information is particularly critical in detecting deepfake videos, since the face-swapping of a video is implemented frame by frame. In this paper, we argue that the temporal differences between authentic and fake videos are complex and cannot be adequately depicted from a single time scale. To obtain a complete picture of the temporal deepfake traces, we design a detection model with a short-term feature extraction module and a long-term feature extraction module. The short-term module captures the gradient information of adjacent frames, which is incorporated with the frequency and spatial information to make a multi-domain feature set. The long-term module then reveals the artifacts from a longer period of context. The proposed algorithm is tested on several popular databases, namely FaceForensics++, DeepfakeDetection (DFD), TIMIT-DF and FFW. Experimental results have validated the effectiveness of our algorithm through improved detection performance compared with related works.
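The record gives no implementation details beyond the abstract. As a rough conceptual sketch only (the function name, the choice of 2-D FFT log-magnitude for the frequency channel, and the padding scheme are our own assumptions, not the authors' method), the multi-domain feature idea — a spatial channel, a frequency channel, and a short-term temporal gradient from adjacent frames — could look like:

```python
import numpy as np

def short_term_features(frames):
    """Illustrative multi-domain features for a window of face crops.

    frames: (T, H, W) array of grayscale frames.
    Returns a (T, 3, H, W) tensor stacking a spatial channel, a
    frequency channel, and a short-term temporal-gradient channel.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Spatial domain: the pixel values themselves.
    spatial = frames
    # Frequency domain: log-magnitude of the per-frame 2-D FFT
    # (one plausible choice; the paper does not specify the transform).
    freq = np.log1p(np.abs(np.fft.fft2(frames, axes=(-2, -1))))
    # Short-term temporal gradient: difference of adjacent frames,
    # prepending the first frame so every frame keeps a channel
    # (the gradient of frame 0 is therefore all zeros).
    grad = np.diff(frames, axis=0, prepend=frames[:1])
    # Stack the three domains into one feature tensor per frame.
    return np.stack([spatial, freq, grad], axis=1)
```

In the paper this feature set feeds the short-term module, and a separate long-term module aggregates artifacts over a wider temporal context; this sketch covers only the per-window feature construction.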
Pages: 47-57
Page count: 11