Constrained Convolutional Neural Networks: A New Approach Towards General Purpose Image Manipulation Detection

Cited by: 386
Authors
Bayar, Belhassen [1]
Stamm, Matthew C. [1]
Affiliation
[1] Drexel Univ, Dept Elect & Comp Engn, Philadelphia, PA 19104 USA
Funding
National Science Foundation (USA)
Keywords
Image forensics; deep learning; convolutional neural networks; deep convolutional features; forensic detection; JPEG compression
DOI
10.1109/TIFS.2018.2825953
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Identifying the authenticity and processing history of an image is an important task in multimedia forensics. By analyzing traces left by different image manipulations, researchers have been able to develop several algorithms capable of detecting targeted editing operations. While this approach has led to the development of several successful forensic algorithms, an important problem remains: creating forensic detectors for different image manipulations is a difficult and time-consuming process. Furthermore, forensic analysts need general-purpose forensic algorithms capable of detecting multiple different image manipulations. In this paper, we address both of these problems by proposing a new general-purpose forensic approach using convolutional neural networks (CNNs). While CNNs are capable of learning classification features directly from data, in their existing form they tend to learn features representative of an image's content. To overcome this issue, we have developed a new type of CNN layer, called a constrained convolutional layer, that is able to jointly suppress an image's content and adaptively learn manipulation detection features. Through a series of experiments, we show that our proposed constrained CNN is able to learn manipulation detection features directly from data. Our experimental results demonstrate that our CNN can detect multiple different editing operations with up to 99.97% accuracy and outperform the existing state-of-the-art general-purpose manipulation detector. Furthermore, our constrained CNN can still accurately detect image manipulations in realistic scenarios where there is a source camera model mismatch between the training and testing data.
Pages: 2691-2706
Number of pages: 16