An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR

Cited by: 13
Authors
Gatos, B [1]
Ntzios, K [1]
Pratikakis, I [1]
Petridis, S [1]
Konidaris, T [1]
Perantonis, S [1]
Affiliations
[1] Natl Ctr Sci Res Demokritos, Inst Informat & Telecommun, Computat Intelligence Lab, Athens 15310, Greece
Keywords
handwriting recognition; character recognition; segmentation-free; feature extraction; historical document recognition; old manuscript recognition
DOI
10.1007/s10044-005-0013-7
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recognition of old Greek manuscripts is essential for quick and efficient content exploitation of valuable old Greek historical collections. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts written in lowercase letters. Based on the existence of closed cavity regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures. First, we detect the closed cavities that exist in the character body. Then, the protrusions in the outer contour of the connected components that contain these closed cavities are used to classify the area around each closed cavity as a specific character or character ligature. The proposed method gives highly accurate results and offers great assistance to old Greek handwritten manuscript OCR. We also provide additional OCR applications that not only demonstrate the robustness of the proposed method but also show its generic nature in cases where segmentation and text location tasks are very difficult to perform.
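The first step named in the abstract, detecting closed cavities, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binarized image given as a list of rows (1 = ink, 0 = background) and treats any 4-connected background region that never reaches the image border as a closed cavity. The function name `find_closed_cavities` and the toy glyphs are hypothetical, chosen only for illustration.

```python
from collections import deque


def find_closed_cavities(image):
    """Return a list of cavities, each a set of (row, col) background pixels
    that are fully enclosed by ink (1) pixels.

    Assumption for this sketch: background regions are 4-connected, and a
    region is a cavity if none of its pixels lies on the image border.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        # Breadth-first flood fill over background (0) pixels.
        region, touches_border = set(), False
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            region.add((r, c))
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return region, touches_border

    cavities = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                region, touches_border = flood(r, c)
                if not touches_border:
                    cavities.append(region)
    return cavities


# Toy glyph with one enclosed hole (hypothetical data, shaped like an "o").
CLOSED_GLYPH = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Toy glyph whose interior opens to the right (shaped like a "c"): no cavity.
OPEN_GLYPH = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(len(find_closed_cavities(CLOSED_GLYPH)))
    print(len(find_closed_cavities(OPEN_GLYPH)))
```

The second step in the abstract, classifying the area around each cavity from the protrusions of the outer contour, depends on feature definitions the abstract does not give, so it is left out of this sketch.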
Pages: 305-320
Page count: 16