Towards Spatio-temporal Collaborative Learning: An End-to-End Deepfake Video Detection Framework

Cited by: 0
Authors
Guo, Wenxuan [1]
Du, Shuo [1]
Deng, Huiyuan [1]
Yu, Zikang [1]
Feng, Lin [2]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Dalian Univ Technol, Sch Innovat & Entrepreneurship, Dalian, Peoples R China
Keywords
Deepfake Detection; Spatio-temporal Modeling; Face Forensics; Deep Learning
DOI
10.1109/IJCNN54540.2023.10191479
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid development of facial tampering techniques, deepfake detection has attracted widespread public concern. Most existing video-based methods apply temporal convolution to learn temporal discontinuities directly, and may therefore overlook both local detail mutations and inconsistent global expression semantics in the temporal dimension. This makes it difficult to learn more discriminative forgery cues. To mitigate this issue, we introduce a novel deepfake video detection framework specifically designed to capture fine-grained traces of tampering. Concretely, we first present a Multilayered Feature Extraction module (MFE) that constructs comprehensive spatio-temporal representations by stitching different levels of features together. Afterward, we propose a Bidirectional temporal Artifact Enhancement module (BAE), which exploits local differences between adjacent frames to enhance frame-level features. Moreover, we present a Cross temporal Stride Aggregation strategy (CSA) to mine inconsistent global semantics and adaptively obtain multi-timescale representations. Extensive experiments on several benchmarks demonstrate that the proposed method outperforms other competitive approaches and achieves state-of-the-art performance.
Pages: 8
Related papers
50 in total
  • [21] Spatio-Temporal Catcher: a Self-Supervised Transformer for Deepfake Video Detection
    Li, Maosen
    Li, Xurong
    Yu, Kun
    Deng, Cheng
    Huang, Heng
    Mao, Feng
    Xue, Hui
    Li, Minghao
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 8707 - 8718
  • [22] End-to-End Video Object Detection with Spatial-Temporal Transformers
    He, Lu
    Zhou, Qianyu
    Li, Xiangtai
    Niu, Li
    Cheng, Guangliang
    Li, Xiao
    Liu, Wenxuan
    Tong, Yunhai
    Ma, Lizhuang
    Zhang, Liqing
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1507 - 1516
  • [23] Synthetic Temporal Anomaly Guided End-to-End Video Anomaly Detection
    Astrid, Marcella
    Zaheer, Muhammad Zaigham
    Lee, Seung-Ik
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 207 - 214
  • [24] Watch Only Once: An End-to-End Video Action Detection Framework
    Chen, Shoufa
    Sun, Peize
    Xie, Enze
    Ge, Chongjian
    Wu, Jiannan
    Ma, Lan
    Shen, Jiajun
    Luo, Ping
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8158 - 8167
  • [25] FVC: An End-to-End Framework Towards Deep Video Compression in Feature Space
    Hu, Zhihao
    Xu, Dong
    Lu, Guo
    Jiang, Wei
    Wang, Wei
    Liu, Shan
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4569 - 4585
  • [26] FrameProv: Towards End-To-End Video Provenance
    Ahmed-Rengers, Mansoor
    NSPW'19: PROCEEDINGS OF THE NEW SECURITY PARADIGMS WORKSHOP, 2019, : 68 - 77
  • [27] An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data
    Song, Sijie
    Lan, Cuiling
    Xing, Junliang
    Zeng, Wenjun
    Liu, Jiaying
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4263 - 4270
  • [28] Predicting spatio-temporal traffic flow: a comprehensive end-to-end approach from surveillance cameras
    Feng, Yuxiang
    Zhao, Yifan
    Zhang, Xingchen
    Batista, Sergio F. A.
    Demiris, Yiannis
    Angeloudis, Panagiotis
    TRANSPORTMETRICA B-TRANSPORT DYNAMICS, 2024, 12 (01)
  • [29] An End to End Framework With Adaptive Spatio-Temporal Attention Module for Human Action Recognition
    Liu, Shaocan
    Ma, Xin
    Wu, Hanbo
    Li, Yibin
    IEEE ACCESS, 2020, 8 : 47220 - 47231
  • [30] End-to-End Semi-Supervised Learning for Video Action Detection
    Kumar, Akash
    Rawat, Yogesh Singh
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14680 - 14690