Using codebooks of fragmented connected-component contours in forensic and historic writer identification

Cited by: 72
Authors
Schomaker, Lambert
Franke, Katrin
Bulacu, Marius
Affiliations
[1] Univ Groningen, AI Inst, NL-9712 TS Groningen, Netherlands
[2] Fraunhofer IPK, D-10587 Berlin, Germany
Keywords
writer identification; author identification; cursive-script segmentation;
DOI
10.1016/j.patrec.2006.08.005
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in 'off-line' writer identification allow for new applications in handwritten text retrieval from archives of scanned historical documents. This paper describes new algorithms for forensic or historical writer identification, using the contours of fragmented connected-components in free-style handwriting. The writer is considered to be characterized by a stochastic pattern generator, producing a family of character fragments (fraglets). Using a codebook of such fraglets from an independent training set, the probability distribution of fraglet contours was computed for an independent test set. Results revealed a high sensitivity of the fraglet histogram in identifying individual writers on the basis of a paragraph of text. Large-scale experiments on the optimal size of Kohonen maps of fraglet contours were performed, showing usable classification rates within a non-critical range of Kohonen map dimensions. The proposed automatic approach bridges the gap between image-statistics approaches and purely knowledge-based manual character-based methods. (c) 2006 Elsevier B.V. All rights reserved.
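The codebook-histogram idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the codebook is already trained (the abstract uses Kohonen maps for this step, which is omitted here), that each fraglet contour has been resampled to a fixed-length coordinate vector, and that histograms are compared with a chi-square distance. The function names and the toy data are hypothetical.

```python
import math


def nearest_codeword(contour, codebook):
    """Index of the codebook prototype closest to the contour (squared Euclidean)."""
    best_index, best_dist = 0, math.inf
    for index, prototype in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(contour, prototype))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def fraglet_histogram(contours, codebook):
    """Normalized histogram of codebook usage for one writing sample."""
    counts = [0] * len(codebook)
    for contour in contours:
        counts[nearest_codeword(contour, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


def chi_square_distance(p, q):
    """Chi-square distance between two histograms (assumed metric); empty bins are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def identify_writer(query_hist, reference_hists):
    """Name of the reference writer whose histogram is nearest to the query."""
    return min(
        reference_hists,
        key=lambda name: chi_square_distance(query_hist, reference_hists[name]),
    )


# Toy data: 2-D vectors stand in for resampled fraglet contours (assumption).
codebook = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
references = {
    "writer_a": fraglet_histogram([(0.9, 0.1), (1.1, 0.0), (0.1, 0.9)], codebook),
    "writer_b": fraglet_histogram([(0.0, 1.1), (0.2, 0.8), (0.9, 1.0)], codebook),
}
query = fraglet_histogram([(1.0, 0.1), (0.8, 0.0), (0.0, 1.0)], codebook)
print(identify_writer(query, references))
```

In this sketch a writer is represented only by how often each codebook entry is used, which mirrors the abstract's view of the writer as a stochastic generator of fraglets.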
Pages: 719-727
Page count: 9