Exposing DeepFake Videos Using Attention Based Convolutional LSTM Network

Cited by: 12
Authors
Su, Yishan [1]
Xia, Huawei [1]
Liang, Qi [2]
Nie, Weizhi [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Sch Microelect, Tianjin 300072, Peoples R China
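Funding
National Natural Science Foundation of China
Keywords
DeepFake detection; Convolutional LSTM; Attention
DOI
10.1007/s11063-021-10588-6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract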
The detection of face tampering in videos created by artificial intelligence techniques (commonly known as DeepFakes) has become an important and challenging task in network security defense. In this paper, we propose a novel attention-based DeepFake video detection method that captures the sharp changes in facial features caused by video compositing. We use a convolutional long short-term memory (ConvLSTM) network to extract both spatial and temporal information from DeepFake videos. Meanwhile, we apply an attention mechanism to emphasize specific facial areas in each video frame. Finally, we design a decoder to further fuse information from multiple frames for more accurate detection results. Experimental results and comparisons with state-of-the-art methods demonstrate that our framework achieves superior performance.
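The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of attention-weighted pooling over facial regions followed by multi-frame fusion. It is not the authors' network: the ConvLSTM and the decoder are omitted entirely, and the function names (`softmax`, `spatial_attention_pool`, `fuse_frames`), the scalar feature maps, and the simple mean used for fusion are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention_pool(feature_map, score_map):
    """Pool one frame's 2D feature map using attention weights.

    feature_map and score_map are hypothetical H x W nested lists.
    Locations with higher scores (e.g. salient facial areas)
    contribute more to the pooled value.
    """
    flat_feats = [v for row in feature_map for v in row]
    flat_scores = [s for row in score_map for s in row]
    weights = softmax(flat_scores)
    return sum(w * f for w, f in zip(weights, flat_feats))


def fuse_frames(frame_values):
    """Fuse per-frame values with a plain mean (a stand-in assumption
    for the learned decoder mentioned in the abstract)."""
    return sum(frame_values) / len(frame_values)


# Two toy 2x2 "frames" with hypothetical attention scores.
frames = [
    ([[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]]),
    ([[2.0, 2.0], [2.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]),
]
pooled = [spatial_attention_pool(f, s) for f, s in frames]
video_score = fuse_frames(pooled)
```

With uniform scores the pooling reduces to a plain average of the frame's features; a strongly peaked score map makes the pooled value track the feature at the attended location.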
Pages: 4159-4175
Page count: 17