Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling

Cited by: 10
Authors
Ignacio Toledo, J. [1]
Sudholt, Sebastian [3]
Fornes, Alicia [2]
Cucurull, Jordi [1]
Fink, Gernot A. [3]
Llados, Josep [2]
Affiliations
[1] Scytl Secure Elect Voting, Barcelona, Spain
[2] Univ Autonoma Barcelona, Comp Vis Ctr, Barcelona, Spain
[3] TU Dortmund Univ, Dept Comp Sci, Dortmund, Germany
Source
STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2016 | 2016, Vol. 10029
Keywords
Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
DOI
10.1007/978-3-319-49055-7_48
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Extracting relevant information from historical document collections is a key step in making these documents available for access and search. The usual approach combines transcription and grammars to extract semantically meaningful entities. In this paper, we describe a new method that obtains word categories directly from non-preprocessed handwritten word images. The method extracts information directly, offering an alternative to transcription, and can therefore serve as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer, which handles the varying shapes of the input images. We ran experiments on a historical marriage record dataset and obtained promising results.
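The role of the Spatial Pyramid Pooling layer mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the use of max pooling are assumptions chosen for illustration. The point it shows is that feature maps of different heights and widths are reduced to vectors of the same length, which is what lets a fixed-size classifier follow convolutional layers applied to word images of varying shape.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    Hypothetical helper: `levels` (grid sizes per pyramid level) and the
    choice of max pooling are illustrative assumptions, not the paper's setup.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so the n bins cover
            # the whole axis and no bin is empty even when height < n.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                # One maximum per channel for this spatial bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    # Length is C * sum(n * n for n in levels), independent of H and W.
    return np.concatenate(pooled)


# Two feature maps with different spatial sizes, as produced by
# convolutional layers applied to word images of different shapes.
rng = np.random.default_rng(0)
short_word = rng.standard_normal((8, 9, 21))
long_word = rng.standard_normal((8, 13, 57))
assert spatial_pyramid_pool(short_word).shape == spatial_pyramid_pool(long_word).shape
```

With 8 channels and levels `(1, 2, 4)`, both inputs map to a vector of length 8 × (1 + 4 + 16) = 168, regardless of their spatial size.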
Pages: 543-552
Page count: 10