Interactive Two-Stream Network Across Modalities for Deepfake Detection

Cited by: 8

Authors:

Wu, Jianghao [1]

Zhang, Baopeng [1]

Li, Zhaoyang [1]

Pang, Guilin [1]

Teng, Zhu [1]

Fan, Jianping [2]

Affiliations:

[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing 100044, Peoples R China

[2] Lenovo Res, AI Lab, Beijing 100085, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023 / Vol. 33 / No. 11

Funding:

National Natural Science Foundation of China;

Keywords:

Deepfake detection; inconsistency representation; cross-modality learning;

DOI:

10.1109/TCSVT.2023.3269841

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

As face forgery techniques have matured, the proliferation of deepfakes may threaten the security of human society. Although existing deepfake detection methods perform well in in-dataset evaluation, their generalization ability still needs improvement, and the representation of imperceptible artifacts plays a significant role in it. In this paper, we propose an Interactive Two-Stream Network (ITSNet) to explore discriminative inconsistency representations from a cross-modality perspective. In particular, the patch-wise Decomposable Discrete Cosine Transform (DDCT) is adopted to extract fine-grained high-frequency clues, and information from different modalities is exchanged via a dedicated interaction module. To perceive temporal inconsistency, we first develop a Short-term Embedding Module (SEM) to refine the subtle local inconsistency representation between adjacent frames, and then design a Long-term Embedding Module (LEM) to further refine the erratic temporal inconsistency representation from a long-range perspective. Extensive experiments on three public datasets show that ITSNet outperforms state-of-the-art methods in both in-dataset and cross-dataset evaluations.


Pages: 6418-6430

Page count: 13

37 records in total

[31] DeepFake detection based on high-frequency enhancement network for highly compressed content [J].

Gao, Jie; Xia, Zhaoqiang; Marcialis, Gian Luca; Dang, Chen; Dai, Jing; Feng, Xiaoyi.

EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249

[32] AVENUE: A Novel Deepfake Detection Method Based on Temporal Convolutional Network and rPPG Information [J].

Birla, Lokendra; Saikia, Trishna; Gupta, Puneet.

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2025, 16 (01)

[33] SiamNet: Exploiting source camera noise discrepancies using Siamese Network for Deepfake Detection [J].

Kingra, Staffy; Aggarwal, Naveen; Kaur, Nirmal.

INFORMATION SCIENCES, 2023, 645

[34] Siamese Network-Based Detection of Deepfake Impersonation Attacks with a Person of Interest Approach [J].

Samrouth, Khouloud; El Housseini, Pia; Deforges, Olivier.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2025, 21 (03)

[35] DEEPFAKE VIDEO DETECTION USING 3D-ATTENTIONAL INCEPTION CONVOLUTIONAL NEURAL NETWORK [J].

Lu, Changlei; Liu, Bin; Zhou, Wenbo; Chu, Qi; Yu, Nenghai.

2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021: 3572-3576

[36] Exposing DeepFake Video Detection Based on Convolutional Long Short-Term Memory Network [J].

Zheng Bowen; Xia Huawei; Chen Ruidong; Han Qiankun.

LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (24)

[37] Exposing low-quality deepfake videos of Social Network Service using Spatial Restored Detection Framework [J].

Li, Ying; Bian, Shan; Wang, Chuntao; Polat, Kemal; Alhudhaif, Adi; Alenezi, Fayadh.

EXPERT SYSTEMS WITH APPLICATIONS, 2023, 231
