Application of principal components analysis and Gaussian mixture models to printer identification

Cited by: 0
Authors
Ali, GN [1]
Mikkilineni, AK [1]
Delp, EJ [1]
Allebach, JP [1]
Affiliation
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
Source
IS&T'S NIP20: INTERNATIONAL CONFERENCE ON DIGITAL PRINTING TECHNOLOGIES, PROCEEDINGS | 2004
Keywords
DOI
Not available
Chinese Library Classification
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Printer identification based on a printed document has many desirable forensic applications. In the electrophotographic (EP) process, quasiperiodic banding artifacts can be used as an effective intrinsic signature. However, in text-only document analysis, the absence of large midtone areas makes it difficult to capture suitable signals for banding detection. Frequency-domain analysis based on the projection signals of individual characters does not provide enough resolution for proper printer identification. Advanced pattern recognition techniques and knowledge of the print mechanism can help us devise an appropriate method to detect these signatures. We can obtain reliable intrinsic signatures from multiple projections and use them to build a classifier that identifies the printer. Projections from individual characters can be viewed as a high-dimensional data set. To create a highly effective pattern recognition tool, this high-dimensional projection data has to be represented in a low-dimensional space. The dimension reduction can be performed with well-known pattern recognition techniques, and a classifier can then be built on the reduced-dimension data set. A popular choice is the Gaussian Mixture Model, in which each printer can be represented by a Gaussian distribution. The distributions of all the printers help us determine the mixing coefficient for the projection from an unknown printer. Finally, the decision-making algorithm can vote for the correct printer. In this paper we describe different classification algorithms for identifying an unknown printer. We present experiments based on several different EP printers in our printer bank, and we compare the classification results obtained with the different classifiers.
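The pipeline outlined in the abstract (reduce the character projections to a low-dimensional space, model each printer with a Gaussian, then vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for projection signals, the function names (`fit_pca`, `fit_gaussian`, `identify_printer`), the dimensions, and the number of retained components are all assumptions made for illustration, and a plain majority vote over per-character maximum-likelihood decisions is used in place of whatever mixing-coefficient rule the paper applies.

```python
import numpy as np

def fit_pca(X, k):
    """Return the training mean and the top-k principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def project(X, mean, components):
    """Map high-dimensional projections into the k-dimensional PCA space."""
    return (X - mean) @ components.T

def fit_gaussian(Z, reg=1e-6):
    """Fit one Gaussian per printer (assumption: a single component each)."""
    mu = Z.mean(axis=0)
    cov = np.cov(Z, rowvar=False) + reg * np.eye(Z.shape[1])
    return mu, cov

def log_density(Z, mu, cov):
    """Log of the multivariate normal density for each row of Z."""
    d = Z.shape[1]
    diff = Z - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def identify_printer(Z, models):
    """Each character votes for its most likely printer; the majority wins."""
    scores = np.column_stack([log_density(Z, mu, cov) for mu, cov in models])
    votes = np.bincount(scores.argmax(axis=1), minlength=len(models))
    return int(votes.argmax()), votes

# Synthetic stand-in data: 3 hypothetical printers, 50-sample projections.
rng = np.random.default_rng(0)
n_printers, dim, k = 3, 50, 5
signatures = rng.normal(size=(n_printers, dim))  # hypothetical per-printer signature

def sample(printer, n):
    """Draw n noisy projections for one printer."""
    return signatures[printer] + 0.5 * rng.normal(size=(n, dim))

train = [sample(p, 200) for p in range(n_printers)]
mean, components = fit_pca(np.vstack(train), k)
models = [fit_gaussian(project(X, mean, components)) for X in train]

# 40 characters from a document printed on (hypothetical) printer 1.
unknown = project(sample(1, 40), mean, components)
predicted, votes = identify_printer(unknown, models)
print("predicted printer:", predicted, "votes:", votes)
```

Voting over many characters means a few misclassified projections do not change the document-level decision, which is one plausible reading of the "decision-making algorithm" mentioned in the abstract.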
Pages: 301-305
Page count: 5