Automatic Transcription of Handwritten Medieval Documents

Cited by: 54
Authors
Fischer, Andreas [1]
Wuethrich, Markus [1]
Liwicki, Marcus [2]
Frinken, Volkmar [1]
Bunke, Horst [1]
Viehhauser, Gabriel [3]
Stolz, Michael [3]
Affiliations
[1] Univ Bern, Inst Comp Sci & Appl Math, Neubruckstr 10, CH-3012 Bern, Switzerland
[2] German Res Ctr Artificial Intelligence DFKI, D-67663 Kaiserslautern, Germany
[3] Univ Bern, Inst Germanis, CH-3012 Bern, Switzerland
Source
2009 15TH INTERNATIONAL CONFERENCE ON VIRTUAL SYSTEMS AND MULTIMEDIA PROCEEDINGS (VSMM 2009) | 2009
Funding
Swiss National Science Foundation
Keywords
HISTORICAL DOCUMENTS; RECOGNITION; SYSTEM;
DOI
10.1109/VSMM.2009.26
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
The automatic transcription of historical documents is vital for the creation of digital libraries. To make images of valuable old documents amenable to browsing, a transcription of high accuracy is needed. In this paper, two state-of-the-art recognizers originally developed for modern scripts are applied to medieval documents. The first is based on Hidden Markov Models and the second uses a neural network with bidirectional Long Short-Term Memory. On a dataset of word images extracted from a 13th-century medieval manuscript, written in Middle High German by several writers, it is demonstrated that a word accuracy of 93.32% is achievable. This is far above the word accuracy of 77.12% achieved with the same recognizers for unconstrained modern scripts written in English. These results encourage the development of real-world systems for automatic transcription of historical documents with a view to image and text browsing in digital libraries.
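The word accuracy figures quoted in the abstract can be read as the percentage of word images whose recognized label matches the ground-truth transcription. The following is a minimal Python sketch under that assumption; the abstract itself does not spell out the metric. The function name `word_accuracy` and the sample word lists are hypothetical illustrations, not taken from the paper or its dataset.

```python
def word_accuracy(predicted, reference):
    """Percentage of positions where the predicted word equals the reference.

    Assumes isolated word images, so both lists are aligned one-to-one
    (an assumption of this sketch, not a detail confirmed by the abstract).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Made-up example: three of four words are recognized correctly.
reference = ["der", "was", "ein", "helt"]
predicted = ["der", "waz", "ein", "helt"]
print(word_accuracy(predicted, reference))  # 75.0
```

Under this reading, the reported accuracies of 93.32% and 77.12% correspond to word error rates of 6.68% and 22.88% respectively.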
Pages: 137 / +
Page count: 2