Towards Spatio-temporal Collaborative Learning: An End-to-End Deepfake Video Detection Framework

Cited by: 0

Authors:

Guo, Wenxuan [1]

Du, Shuo [1]

Deng, Huiyuan [1]

Yu, Zikang [1]

Feng, Lin [2]

Affiliations:

[1] Dalian University of Technology, Dalian, People's Republic of China

[2] Dalian University of Technology, School of Innovation and Entrepreneurship, Dalian, People's Republic of China

Source:

2023 International Joint Conference on Neural Networks (IJCNN) | 2023

Keywords:

Deepfake Detection; Spatio-temporal Modeling; Face Forensics; Deep Learning

DOI:

10.1109/IJCNN54540.2023.10191479

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the rapid development of facial tampering techniques, the deepfake detection task has attracted widespread public concern. Most existing video-based methods adopt temporal convolution to learn temporal discontinuities directly, and may therefore neglect both local detail mutations and inconsistent global expression semantics in the temporal dimension. This makes it difficult to learn more discriminative forgery cues. To mitigate this issue, we introduce a novel deepfake video detection framework specifically designed to capture fine-grained traces of tampering. Concretely, we first present a Multilayered Feature Extraction module (MFE) that constructs comprehensive spatio-temporal representations by stitching different levels of features together. Afterward, we propose a Bidirectional temporal Artifact Enhancement module (BAE), which exploits local differences between adjacent frames to enhance frame-level features. Moreover, we present a Cross temporal Stride Aggregation strategy (CSA) to mine inconsistent global semantics and adaptively obtain multi-timescale representations. Extensive experiments on several benchmarks demonstrate that the proposed method outperforms other competitive approaches and achieves state-of-the-art performance.
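The two temporal ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the actual BAE and CSA modules are learned network components whose details are not given in this record. The function names `bidirectional_differences` and `stride_aggregate`, the use of absolute differences, and the plain averaging over strides are all assumptions made for illustration only.

```python
from typing import List, Sequence

Vector = List[float]


def bidirectional_differences(frames: Sequence[Vector]) -> List[Vector]:
    """Toy stand-in for adjacent-frame enhancement (hypothetical, not BAE itself).

    Each frame feature is boosted by the magnitude of its difference to the
    next frame (forward) and to the previous frame (backward). Boundary
    frames use a zero difference on the missing side.
    """
    n = len(frames)
    zero = [0.0] * len(frames[0])
    enhanced = []
    for t in range(n):
        fwd = [b - a for a, b in zip(frames[t], frames[t + 1])] if t + 1 < n else zero
        bwd = [b - a for a, b in zip(frames[t], frames[t - 1])] if t > 0 else zero
        enhanced.append([x + abs(f) + abs(b) for x, f, b in zip(frames[t], fwd, bwd)])
    return enhanced


def stride_aggregate(frames: Sequence[Vector], strides: Sequence[int] = (1, 2, 4)) -> Vector:
    """Toy stand-in for multi-timescale aggregation (hypothetical, not CSA itself).

    For each stride, sample every stride-th frame and average the samples;
    then average the per-stride summaries with equal weights. A learned
    model would presumably weight the timescales adaptively instead.
    """
    dim = len(frames[0])
    summaries = []
    for s in strides:
        sampled = frames[::s]
        summaries.append([sum(f[d] for f in sampled) / len(sampled) for d in range(dim)])
    return [sum(v[d] for v in summaries) / len(summaries) for d in range(dim)]


if __name__ == "__main__":
    # Three frames with two-dimensional features; only the first dimension changes.
    video = [[0.0, 1.0], [1.0, 1.0], [3.0, 1.0]]
    print(bidirectional_differences(video))
    print(stride_aggregate(video, strides=(1, 2)))
```

In this sketch, feature dimensions that change abruptly between neighboring frames are amplified, while static dimensions pass through unchanged, which is the intuition behind using local frame differences as forgery cues.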


Pages: 8
