Classification of Arabic Writer Based on Clustering Techniques

Cited by: 7
Authors
Ahmed, Ahmed Abdullah [1]
Al-Tamimi, Mohammed Sabbih [2]
Al-Sanjary, Omar Ismael [3]
Sulong, Ghazali [4]
Affiliations
[1] Kurdistan Tech Inst, Dept Comp Sci, Sulaymaniyah Kurdista, Iraq
[2] Univ Baghdad, Dept Comp Sci, Coll Sci, Baghdad, Iraq
[3] Nawroz Univ Kurdistan Reg, Ctr Sci Res & Dev, Duhok, Iraq
[4] Univ Malaysia Terengganu, Sch Informat & Appl Math, Kuala Nerus 21030, Terengganu, Malaysia
Source
RECENT TRENDS IN INFORMATION AND COMMUNICATION TECHNOLOGY | 2018 / Vol. 5
Keywords
Clustering; Writer identification; Feature extraction; Feature combination; Distance measures; HANDWRITTEN DIGIT RECOGNITION; IDENTIFICATION
DOI
10.1007/978-3-319-59427-9_6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Arabic text categorization for pattern recognition is challenging. We propose, for the first time, a novel holistic clustering-based method for classifying Arabic writers. The categorization proceeds in stages. First, the document images are segmented into lines, words, and characters. Second, structural and statistical features are extracted from the segmented portions. Third, the F-measure is used to evaluate the performance of the extracted features and their combinations under different linkage methods, for each distance measure and for different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text, which comprises varying samples from 1000 writers. The results in the generation step are obtained from multiple runs of the individual clustering methods for each distance measure. The best results are achieved when intensity, line slope, and their combination are used as the feature set. The results demonstrate that, given a good feature set, varying the number of clusters can deliver significant improvements in the clustering of handwritten structures.
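The evaluation loop described in the abstract (cluster the feature vectors under several linkage methods and distance measures, then score each grouping with an F-measure) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The two-dimensional toy vectors (standing in for intensity and line-slope features), the writer labels, the naive agglomerative routine, and the choice of the class-weighted best-match F-measure are all assumptions made for the example; the abstract does not specify which F-measure variant, linkage methods, or distance measures were used.

```python
from collections import Counter
from itertools import combinations


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Linkage = how pairwise point distances are reduced to one cluster distance.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda d: sum(d) / len(d),
}


def agglomerative(points, n_clusters, dist=euclidean, linkage="average"):
    """Naive agglomerative clustering; returns one cluster label per point."""
    reduce_fn = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: reduce_fn(
                [dist(points[i], points[j])
                 for i in clusters[p[0]] for j in clusters[p[1]]]
            ),
        )
        clusters[a] += clusters.pop(b)  # a < b, so index a stays valid
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def clustering_f_measure(true_labels, cluster_labels):
    """Class-size-weighted best-match F-measure (an assumed variant)."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_labels, cluster_labels))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij:
                precision, recall = n_ij / n_j, n_ij / n_i
                best = max(best, 2 * precision * recall / (precision + recall))
        total += n_i / n * best
    return total


# Hypothetical toy data: (intensity, line slope) per sample, two samples per writer.
features = [(0.10, 0.20), (0.12, 0.22), (0.90, 0.80),
            (0.88, 0.83), (0.50, 0.10), (0.52, 0.12)]
writers = ["w1", "w1", "w2", "w2", "w3", "w3"]

for dist in (euclidean, manhattan):
    for linkage in LINKAGES:
        for k in (2, 3):
            labels = agglomerative(features, k, dist, linkage)
            score = clustering_f_measure(writers, labels)
            print(dist.__name__, linkage, k, round(score, 3))
```

Sweeping the distance measure, the linkage method, and the number of clusters in nested loops mirrors the comparison the abstract describes; on real data an optimized library routine would normally replace the naive clustering function.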
Pages: 48-58
Page count: 11