Image-based historical manuscript dating using contour and stroke fragments

Cited by: 33
Authors
He, Sheng [1]
Samara, Petros [2]
Burgers, Jan [3]
Schomaker, Lambert [1]
Affiliations
[1] Univ Groningen, Inst Artificial Intelligence & Cognit Engn, POB 407, NL-9700 AK Groningen, Netherlands
[2] Univ Amsterdam, Dept Hist, Spuistr 134, NL-1012 VB Amsterdam, Netherlands
[3] Huygens Inst Nederlandse Geschiedenis, POB 90754, NL-2509 LT The Hague, Netherlands
Keywords
Historical manuscript dating; Writer identification; Contour fragment; Stroke fragment; Handwriting style; INDEPENDENT WRITER IDENTIFICATION; AGE ESTIMATION; RECOGNITION; FEATURES; CLASSIFICATION; BINARIZATION
DOI
10.1016/j.patcog.2016.03.032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Historical manuscript dating has always been an important challenge for historians but since countless manuscripts have become digitally available recently, the pattern recognition community has started addressing the dating problem as well. In this paper, we present a family of local contour fragments (kCF) and stroke fragments (kSF) features and study their application to historical document dating. kCF are formed by a number of k primary contour fragments segmented from the connected component contours of handwritten texts and kSF are formed by a segment of length k of a stroke fragment graph. The kCF and kSF are described by scale and rotation invariant descriptors and encoded into trained codebooks inspired by classical bag of words model. We evaluate our methods on the Medieval Paleographical Scale (MPS) data set and perform dating by writer identification and classification. As far as dating by writer identification is concerned, we arrive at the conclusion that features which perform well for writer identification are not necessarily suitable for historical document dating. Experimental results of dating by classification demonstrate that a combination of kCF and kSF achieves optimal results, with a mean absolute error of 14.9 years when excluding writer duplicates in training and 7.9 years when including writer duplicates in training. © 2016 Elsevier Ltd. All rights reserved.
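The abstract names two generic building blocks: encoding local descriptors against a trained codebook in a bag-of-words fashion, and scoring dating results by mean absolute error in years. The sketch below illustrates only those two ideas and is not the authors' implementation. The function names `encode_bow` and `mean_absolute_error`, the nearest-codeword (hard) assignment, the L1 normalization, and the toy descriptors and years are all assumptions made for illustration; the abstract does not say how the kCF/kSF descriptors are computed or how the codebooks are trained.

```python
import numpy as np


def encode_bow(descriptors, codebook):
    """Encode a document as a normalized histogram of codeword occurrences.

    Assumption: each descriptor is assigned to its nearest codeword
    (Euclidean distance); the abstract does not specify the assignment rule.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Pairwise squared distances, shape (n_descriptors, n_codewords).
    diff = descriptors[:, None, :] - codebook[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    nearest = sq_dist.argmin(axis=1)
    # Count how often each codeword was selected, then L1-normalize.
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


def mean_absolute_error(true_years, predicted_years):
    """Mean absolute difference, in years, between true and predicted dates."""
    true_years = np.asarray(true_years, dtype=float)
    predicted_years = np.asarray(predicted_years, dtype=float)
    return float(np.abs(true_years - predicted_years).mean())


if __name__ == "__main__":
    # Hypothetical 2-D descriptors and a two-word codebook.
    toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
    toy_descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 1.2], [0.0, 0.2]]
    print(encode_bow(toy_descriptors, toy_codebook))

    # Hypothetical ground-truth and predicted years for three documents.
    print(mean_absolute_error([1300, 1350, 1400], [1310, 1350, 1375]))
```

In this toy setup the histogram is the document-level feature that a classifier would consume, and the error function corresponds to the kind of figure the abstract reports (14.9 and 7.9 years).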
Pages: 159-171
Page count: 13