WRITER IDENTIFICATION OF ARABIC TEXT USING STATISTICAL AND STRUCTURAL FEATURES

Cited by: 24
Authors
Awaida, Sameh M. [1]
Mahmoud, Sabri A. [2]
Affiliations
[1] Qassim Univ, Qasim, Saudi Arabia
[2] King Fahd Univ Petr & Minerals, Dept Informat & Comp Sci, Dhahran 31261, Saudi Arabia
Keywords
Arabic writer identification system; feature combination; feature extraction; handwriting analysis; handwritten text; text-independent writer identification; verification
DOI
10.1080/01969722.2012.732802
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This article addresses writer identification of handwritten Arabic text. Several types of structural and statistical features were extracted from Arabic handwriting text. A novel approach was used to extract structural features that build on some of the main characteristics of the Arabic language. Connected component features for Arabic handwritten text as well as gradient distribution features, windowed gradient distribution features, contour chain code distribution features, and windowed contour chain code distribution features were extracted. A nearest neighbor (NN) classifier was used with the Euclidean distance measure. Data reduction algorithms (viz. principal component analysis [PCA], linear discriminant analysis [LDA], multiple discriminant analysis [MDA], multidimensional scaling [MDS], and forward/backward feature selection algorithm) were used. A database of 500 paragraphs handwritten in Arabic by 250 writers was used. The paragraphs used were randomly generated from a large corpus. NN provided the best accuracy in text-independent writer identification with top-1 result of 88.0%, top-5 result of 96.0%, and top-10 result of 98.5% for the first 100 writers. Extending the work to include all 250 writers and with the backward feature selection algorithm (using 54 out of 83 features), the system attained a top-1 result of 75.0%, top-5 result of 91.8%, and top-10 result of 95.4%.
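The classification and scoring steps named in the abstract (a nearest neighbor classifier with the Euclidean distance, evaluated by top-k identification rate) can be illustrated with a minimal sketch. The function names, the writer labels, and the 2-D toy vectors below are hypothetical and chosen only for illustration; they are not the paper's 83 features, its data, or its implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_writers(query, gallery):
    """Return writer IDs ordered by the distance of their closest gallery sample."""
    best = {}
    for writer, features in gallery:
        d = euclidean(query, features)
        if writer not in best or d < best[writer]:
            best[writer] = d
    return sorted(best, key=lambda w: best[w])


def top_k_accuracy(queries, gallery, k):
    """Fraction of queries whose true writer is among the k nearest writers."""
    hits = sum(
        1
        for true_writer, features in queries
        if true_writer in rank_writers(features, gallery)[:k]
    )
    return hits / len(queries)


# Toy, made-up feature vectors (assumption: one reference sample per writer).
gallery = [("A", [0.0, 0.0]), ("B", [1.0, 0.0]), ("C", [5.0, 5.0])]
queries = [("A", [0.1, 0.0]), ("B", [0.4, 0.0]), ("C", [4.0, 4.0])]

top1 = top_k_accuracy(queries, gallery, 1)
top2 = top_k_accuracy(queries, gallery, 2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

In this toy setup the second query lies closer to writer A's sample than to its true writer B, so it counts as a miss at top-1 but as a hit at top-2. This is the same reason the reported top-5 and top-10 rates exceed the top-1 rate.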
Pages: 57 - 76
Page count: 20
Related papers
50 entries in total
  • [21] Separation between Arabic and Latin Scripts from Bilingual Text Using Structural Features
    Haboubi, Sofiene
    Maddouri, Samia Snoussi
    Amiri, Hamid
    INTEGRATED COMPUTING TECHNOLOGY, 2011, 165 : 132 - 143
  • [22] An Effective Combination of MPP Contour-Based Features for Off-Line Text-Independent Arabic Writer Identification
    Abdi, Mohamed Nidhal
    Khemakhem, Maher
    Ben-Abdallah, Hanene
    SIGNAL PROCESSING, IMAGE PROCESSING, AND PATTERN RECOGNITION, 2009, 61 : 209 - 220
  • [23] A New Text-Independent GMM Writer Identification System Applied to Arabic Handwriting
    Slimane, Fouad
    Maergner, Volker
    2014 14TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2014 : 708 - 713
  • [24] Offline Text-independent Writer Identification Using Stroke Fragment and Contour Based Features
    Tang, Youbao
    Wu, Xiangqian
    Bu, Wei
    2013 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), 2013,
  • [25] Local fragment distribution features for text-independent writer identification
    Hong, D., 1600, Trade Science Inc, 126, Prasheel Park, Sanjay Raj Farm House, Nr. Saurashtra Unive, Rajkot, Gujarat, 360 005, India (08):
  • [26] Writer Identification Using Edge Based Features
    Fan, Zhenyin
    Guo, Zhenhua
    Chen, Youbin
    PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015 : 416 - 420
  • [27] On writer identification for Arabic historical manuscripts
    Asi, Abedelkadir
    Abdalhaleem, Alaa
    Fecker, Daniel
    Maergner, Volker
    El-Sana, Jihad
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2017, 20 (03) : 173 - 187
  • [29] Writer Identification for Historical Arabic Documents
    Fecker, Daniel
    Asi, Abedelkadir
    Maergner, Volker
    El-Sana, Jihad
    Fingscheidt, Tim
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014 : 3050 - 3055
  • [30] New Online/Offline text-dependent Arabic Handwriting dataset for Writer Authentication and Identification
    Al-Shamaileh, Mohammad Z.
    Hassanat, Ahmad B.
    Tarawneh, Ahmad S.
    Rahman, M. Sohel
    Celik, Ceyhun
    Jawthari, Moohanad
    2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2019 : 116 - 121