Arabic handwriting recognition using structural and syntactic pattern attributes

Cited by: 79
Authors
Parvez, Mohammad Tanvir [1]
Mahmoud, Sabri A. [2]
Affiliations
[1] Qassim Univ, Dept Comp Engn, Qasim 51477, Saudi Arabia
[2] King Fahd Univ Petr & Minerals KFUPM, Informat & Comp Sci Dept, Dhahran 31261, Saudi Arabia
Keywords
Arabic handwriting recognition; Structural recognition; Arabic OCR; Nearest neighbors; Median computation
DOI
10.1016/j.patcog.2012.07.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present research results on off-line Arabic handwriting recognition using structural techniques. Statistical methods have been more common in the reported research on Arabic handwriting recognition, while structural methods have remained largely unexplored. However, statistical and structural techniques can be effectively integrated in multi-classifier systems. This paper presents, to our knowledge, the first integrated off-line Arabic handwritten text recognition system based on structural techniques. In implementing the system, several novel algorithms and techniques for structural recognition of Arabic handwriting are introduced. An Arabic text line is segmented into words/sub-words, and dots are extracted. An adaptive slant correction algorithm is presented that can correct the different slant angles of the different components of a text line. A novel segmentation algorithm, which is integrated into the recognition phase, is designed based on the nature of Arabic writing and utilizes a polygonal approximation algorithm. Arabic characters are then modeled by 'fuzzy' polygons and recognized using a novel fuzzy polygon matching algorithm. Dynamic programming is used to select the best hypotheses of a sequence of recognized characters for each word/sub-word. In addition, several other key ideas, such as prototype selection using set-medians and lexicon reduction using dot-descriptors, are utilized to design a robust handwriting recognition system. Results are reported on the benchmark IfN/ENIT database of Tunisian city names and indicate the robustness and effectiveness of our system. The recognition rates are comparable to those of multi-classifier implementations and better than those of single-classifier systems. (C) 2012 Elsevier Ltd. All rights reserved.
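The abstract mentions prototype selection using set-medians without giving details. As a rough illustration only, the sketch below shows the general set-median idea: among a set of samples, pick the one whose total distance to all the others is smallest. The function name `set_median` and the generic `distance` callback are assumptions made for this example; the paper's actual distance measure (its fuzzy polygon matching) and selection procedure are not described in this record.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def set_median(samples: Sequence[T], distance: Callable[[T, T], float]) -> T:
    """Return the sample with the smallest summed distance to all samples.

    Hypothetical helper for illustration; `distance` stands in for whatever
    dissimilarity measure a real system would use between two shapes.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    # Evaluate each candidate by its total distance to every sample and keep
    # the candidate with the minimum total (first one wins on ties).
    return min(
        samples,
        key=lambda candidate: sum(distance(candidate, other) for other in samples),
    )


if __name__ == "__main__":
    # Toy example on numbers with absolute difference as the distance.
    print(set_median([1, 2, 3, 10], lambda a, b: abs(a - b)))
```

This direct version evaluates every pair, so its cost grows quadratically with the number of samples; that is usually acceptable for small per-class prototype sets.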
Pages: 141-154
Page count: 14