Hybrid OCR combination approach complemented by a specialized ICR applied on ancient documents

Cited by: 12
Authors
Cecotti, H [1]
Belaïd, A [1]
Affiliation
[1] CNRS, LORIA, F-54506 Vandoeuvre les Nancy, France
Source
EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS | 2005
Keywords
DOI
10.1109/ICDAR.2005.130
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although commercial Optical Character Recognition (OCR) systems have improved in recent years, their ability to process many different kinds of documents can also be a weakness: they achieve high accuracy on standard cases but cannot produce perfect recognition for all documents. We propose in this paper a model combining several OCRs with a specialized ICR (Intelligent Character Recognition) based on a convolutional neural network that complements them. Rather than simply running several OCRs in parallel and applying a fusion rule to their results, the model adds a specialized neural network with an adaptive topology that complements the OCRs according to the errors they make. The system has been tested on ancient documents containing old characters and old fonts not used in contemporary documents. The OCR combination increases the recognition rate by about 3%, whereas the ICR improves the recognition of rejected characters by more than 5%.
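The combine-then-complement idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the fusion rule or say how characters are rejected, so this sketch assumes a simple per-character majority vote over OCR outputs that are already aligned, and stands in for the convolutional ICR with a plain callable. The names `vote`, `combine`, `icr_fallback`, and `min_agreement` are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Callable, List, Optional


def vote(candidates: List[str], min_agreement: int) -> Optional[str]:
    """Return the most frequent character if enough engines agree, else None.

    Assumption: a plain majority vote; the paper's actual fusion rule is not
    given in the abstract.
    """
    char, count = Counter(candidates).most_common(1)[0]
    return char if count >= min_agreement else None


def combine(
    ocr_outputs: List[str],
    icr_fallback: Callable[[int], str],
    min_agreement: int = 2,
) -> str:
    """Fuse aligned OCR outputs and send rejected positions to a fallback.

    Assumption: all outputs have the same length and are aligned character by
    character. `icr_fallback` is a hypothetical stand-in for the specialized
    ICR; it receives the position of the rejected character.
    """
    length = len(ocr_outputs[0])
    if any(len(output) != length for output in ocr_outputs):
        raise ValueError("OCR outputs must be aligned to the same length")

    result = []
    for position in range(length):
        decided = vote([output[position] for output in ocr_outputs], min_agreement)
        # A rejected character (no sufficient agreement) goes to the fallback.
        result.append(decided if decided is not None else icr_fallback(position))
    return "".join(result)


if __name__ == "__main__":
    # Three hypothetical engines disagree on the middle character.
    print(combine(["cat", "cot", "c?t"], icr_fallback=lambda position: "#"))
```

In this sketch only the characters the OCRs cannot agree on reach the fallback, which mirrors the abstract's description of the ICR handling rejected characters; a real system would also need an alignment step before voting.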
Pages: 1045-1049
Page count: 5