Towards Spatio-temporal Collaborative Learning: An End-to-End Deepfake Video Detection Framework

Cited by: 0
Authors
Guo, Wenxuan [1]
Du, Shuo [1]
Deng, Huiyuan [1]
Yu, Zikang [1]
Feng, Lin [2]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Dalian Univ Technol, Sch Innovat & Entrepreneurship, Dalian, Peoples R China
Keywords
Deepfake Detection; Spatio-temporal Modeling; Face Forensics; Deep Learning
DOI
10.1109/IJCNN54540.2023.10191479
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid development of facial tampering techniques, deepfake detection has attracted widespread public concern. Most existing video-based methods apply temporal convolution to learn temporal discontinuities directly, and in doing so may overlook both local detail mutations and inconsistent global expression semantics along the temporal dimension, which makes it difficult to learn discriminative forgery cues. To mitigate this issue, we introduce a novel deepfake video detection framework specifically designed to capture fine-grained traces of tampering. Concretely, we first present a Multilayered Feature Extraction (MFE) module that constructs comprehensive spatio-temporal representations by stitching together features from different levels. We then propose a Bidirectional temporal Artifact Enhancement (BAE) module, which exploits local differences between adjacent frames to enhance frame-level features. Moreover, we present a Cross temporal Stride Aggregation (CSA) strategy to mine inconsistent global semantics and adaptively obtain multi-timescale representations. Extensive experiments on several benchmarks demonstrate that the proposed method achieves state-of-the-art performance compared with other competitive approaches.
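The abstract gives no implementation details, so the following is only a toy illustration of the two temporal ideas it names: enhancing frame-level features with differences to adjacent frames in both directions, and pooling features sampled at several temporal strides. It is not the authors' BAE or CSA design. The function names, the `alpha` weight, the stride set, and the uniform averaging (the paper describes an adaptive aggregation) are all assumptions made for illustration.

```python
from typing import List, Sequence

Frame = List[float]  # one feature vector per frame (hypothetical layout)


def bidirectional_difference_enhance(frames: List[Frame], alpha: float = 0.5) -> List[Frame]:
    """Add the mean absolute difference to the previous and next frame onto each frame.

    `alpha` is an assumed weighting factor, not a value from the paper.
    """
    n = len(frames)
    enhanced = []
    for t, cur in enumerate(frames):
        # At the sequence boundaries, reuse the current frame so the missing
        # neighbour contributes a zero difference.
        prev = frames[t - 1] if t > 0 else cur
        nxt = frames[t + 1] if t < n - 1 else cur
        out = []
        for c, p, q in zip(cur, prev, nxt):
            diff = 0.5 * (abs(c - p) + abs(q - c))  # backward and forward difference
            out.append(c + alpha * diff)
        enhanced.append(out)
    return enhanced


def cross_stride_aggregate(frames: List[Frame], strides: Sequence[int] = (1, 2, 4)) -> Frame:
    """Average frame features sampled at each stride, then average across strides.

    Uniform averaging stands in for the adaptive weighting described in the abstract.
    """
    dim = len(frames[0])
    per_stride = []
    for s in strides:
        sampled = frames[::s]  # every s-th frame
        per_stride.append([sum(f[d] for f in sampled) / len(sampled) for d in range(dim)])
    return [sum(v[d] for v in per_stride) / len(per_stride) for d in range(dim)]


if __name__ == "__main__":
    clip = [[0.0], [1.0], [0.0], [1.0]]  # toy 4-frame clip with 1-D features
    print(bidirectional_difference_enhance(clip))
    print(cross_stride_aggregate(clip, strides=(1, 2)))
```

In this toy setting, a clip whose features flicker between frames receives larger enhancement terms than a static clip, which is left unchanged. This reflects the intuition that frame-to-frame discontinuities can serve as forgery cues.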
Pages: 8