Augmented Multi-Scale Spatiotemporal Inconsistency Magnifier for Generalized DeepFake Detection

Cited by: 16

Authors:

Yu, Yang [1,2]

Zhao, Xiaohui [1,2]

Ni, Rongrong [1,2]

Yang, Siyuan [3]

Zhao, Yao [1]

Kot, Alex C. [4]

Affiliations:

[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China

[2] Beijing Jiaotong Univ, Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100044, Peoples R China

[3] Nanyang Technol Univ, Interdisciplinary Grad Programme, Rapid Rich Object Search Lab, Singapore 639798, Singapore

[4] Nanyang Technol Univ, Sch Elect & Elect Engn, Nanyang 639798, Singapore

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Keywords:

Deepfakes; Spatiotemporal phenomena; Faces; Forgery; Heating systems; Detectors; Convolution; Adversarial data augmentation; generalized DeepFake detection; global guidance; multi-scale spatiotemporal inconsistency; FORGERY DETECTION; VIDEO;

DOI:

10.1109/TMM.2023.3237322

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Recently, realistic DeepFake videos have raised severe security concerns in society. Existing video-based detection methods observe local spatial regions with a coarse temporal view, which makes it difficult to capture subtle spatiotemporal information and limits their generalization ability. In this paper, we propose a novel Augmented Multi-scale Spatiotemporal Inconsistency Magnifier (AMSIM) with a Global Inconsistency View (GIV) and a more meticulous Multi-timescale Local Inconsistency View (MLIV), focusing on mining comprehensive and more subtle spatiotemporal cues. First, the GIV, which includes the global spatial and long-term temporal views, is established to ensure that comprehensive spatiotemporal clues are captured. Then, the MLIV, with critical local spatial and multi-timescale local temporal views, is designed to magnify hard-to-detect spatiotemporal abnormalities. Subsequently, GIV is used to guide MLIV to dynamically find local spatiotemporal anomalies that are highly relevant to the overall video. Finally, to further obtain a generalized framework, an adversarial data augmentation is specially designed to expand source domains and simulate unseen forgery domains. Extensive experiments on six large-scale datasets show that AMSIM outperforms state-of-the-art detection methods and remains effective when applied to unseen forgery techniques and datasets.

Pages: 8487-8498

Page count: 12
