Image-based historical manuscript dating using contour and stroke fragments

Cited by: 33
Authors
He, Sheng [1]
Samara, Petros [2]
Burgers, Jan [3]
Schomaker, Lambert [1]
Affiliations
[1] Univ Groningen, Inst Artificial Intelligence & Cognit Engn, POB 407, NL-9700 AK Groningen, Netherlands
[2] Univ Amsterdam, Dept Hist, Spuistr 134, NL-1012 VB Amsterdam, Netherlands
[3] Huygens Inst Nederlandse Geschiedenis, POB 90754, NL-2509 LT The Hague, Netherlands
Keywords
Historical manuscript dating; Writer identification; Contour fragment; Stroke fragment; Handwriting style; INDEPENDENT WRITER IDENTIFICATION; AGE ESTIMATION; RECOGNITION; FEATURES; CLASSIFICATION; BINARIZATION
DOI
10.1016/j.patcog.2016.03.032
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dating historical manuscripts has long been an important challenge for historians. Now that countless manuscripts have become digitally available, the pattern recognition community has started to address the dating problem as well. In this paper, we present a family of local features, contour fragments (kCF) and stroke fragments (kSF), and study their application to historical document dating. A kCF is formed by k primary contour fragments segmented from the connected-component contours of handwritten text, and a kSF is formed by a segment of length k of a stroke fragment graph. Both kCF and kSF are described by scale- and rotation-invariant descriptors and encoded with trained codebooks, inspired by the classical bag-of-words model. We evaluate our methods on the Medieval Paleographical Scale (MPS) data set and perform dating both by writer identification and by classification. For dating by writer identification, we conclude that features which perform well for writer identification are not necessarily suitable for historical document dating. Experimental results for dating by classification demonstrate that a combination of kCF and kSF achieves the best results, with a mean absolute error of 14.9 years when writer duplicates are excluded from training and 7.9 years when they are included. © 2016 Elsevier Ltd. All rights reserved.
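The codebook encoding and the error measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `encode_bow` and `mean_absolute_error`, the hard nearest-codeword assignment, the toy 2-D descriptors, and the example years are all assumptions made for illustration, and the kCF/kSF descriptors themselves are not reproduced here.

```python
import numpy as np


def encode_bow(descriptors, codebook):
    """Return an L1-normalized histogram of nearest-codeword assignments.

    Hypothetical helper: hard assignment is assumed here; the paper's
    exact encoding scheme may differ.
    """
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Index of the closest codeword for each descriptor.
    nearest = np.argmin(dists, axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


def mean_absolute_error(predicted_years, true_years):
    """Mean absolute difference, in years, between predicted and true dates."""
    predicted = np.asarray(predicted_years, dtype=float)
    true = np.asarray(true_years, dtype=float)
    return float(np.mean(np.abs(predicted - true)))


# Toy data (made up): three codewords and four fragment descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]])
histogram = encode_bow(descriptors, codebook)

# Toy predicted and true years (made up) for three documents.
mae = mean_absolute_error([1325, 1400, 1475], [1300, 1400, 1500])
```

In a pipeline of this kind, one histogram per document would serve as the feature vector passed to a classifier over date classes, and the mean absolute error would be computed between predicted and known years.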
Pages: 159-171
Number of pages: 13