Fake visual content detection using two-stream convolutional neural networks

Cited by: 8

Authors:

Yousaf, Bilal ^{[1]}

Usama, Muhammad ^{[2]}

Sultani, Waqas ^{[1]}

Mahmood, Arif ^{[1]}

Qadir, Junaid ^{[3,4]}

Affiliations:

[1] Informat Technol Univ ITU, Dept Comp Sci, Lahore, Pakistan

[2] Lahore Univ Management Sci LUMS, Lahore, Pakistan

[3] Qatar Univ, Coll Engn, Dept Comp Sci & Engn CSE, Doha, Qatar

[4] Informat Technol Univ ITU, Dept Elect Engn, Lahore, Pakistan

Source:

NEURAL COMPUTING & APPLICATIONS | 2022 / Vol. 34 / Issue 10

Keywords:

Deepfakes; Two-stream network; Frequency stream; Combination of discrete Fourier transform and discrete wavelet; FORGERIES;

DOI:

10.1007/s00521-022-06902-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Rapid progress in adversarial learning has enabled the generation of realistic-looking fake visual content. To distinguish between fake and real visual content, several detection techniques have been proposed. The performance of most of these techniques, however, drops off significantly if the test and training data are sampled from different distributions. This motivates efforts towards improving the generalization of fake detectors. Since current fake content generation techniques do not accurately model the frequency spectrum of natural images, we observe that the frequency spectrum of fake visual data contains discriminative characteristics that can be used to detect fake content. We also observe that the information captured in the frequency spectrum is different from that of the spatial domain. Using these insights, we propose to combine complementary frequency and spatial domain features using a two-stream convolutional neural network architecture called TwoStreamNet. We demonstrate the improved generalization of the proposed two-stream network to several unseen generation architectures, datasets, and techniques. The proposed detector demonstrates a significant performance improvement over current state-of-the-art fake content detectors, and fusing the frequency and spatial domain streams also improves the generalization of the detector.
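
The abstract does not say how the frequency-stream input is constructed, so the following is only a minimal sketch of one plausible reading of the keywords ("Combination of discrete Fourier transform and discrete wavelet"). It assumes a grayscale image with even height and width, a centered log-magnitude DFT, and a hand-written one-level Haar wavelet transform. The function names and the two-channel layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def log_magnitude_spectrum(gray: np.ndarray) -> np.ndarray:
    """Centered log-magnitude of the 2D DFT of a grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log1p(np.abs(spectrum))


def haar_dwt2(gray: np.ndarray):
    """One-level orthonormal 2D Haar transform (even height and width)."""
    a = gray[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = gray[0::2, 1::2]  # top-right
    c = gray[1::2, 0::2]  # bottom-left
    d = gray[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0     # local average (low-pass)
    row_diff = (a + b - c - d) / 2.0   # difference between the two rows
    col_diff = (a - b + c - d) / 2.0   # difference between the two columns
    diag_diff = (a - b - c + d) / 2.0  # diagonal difference
    return approx, row_diff, col_diff, diag_diff


def frequency_stream_input(gray: np.ndarray) -> np.ndarray:
    """Stack DFT and DWT views into a (2, H, W) array.

    Assumption for illustration only: the four wavelet subbands are
    tiled into a single H x W mosaic so both channels share one shape.
    """
    approx, row_diff, col_diff, diag_diff = haar_dwt2(gray)
    mosaic = np.block([[approx, col_diff], [row_diff, diag_diff]])
    return np.stack([log_magnitude_spectrum(gray), mosaic])
```

Under these assumptions, an array such as `frequency_stream_input(image)` would feed the frequency stream while the unmodified pixels feed the spatial stream; how the two streams are fused is left open here because the abstract does not describe it.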


Pages: 7991-8004

Number of pages: 14

