WRITER IDENTIFICATION OF ARABIC TEXT USING STATISTICAL AND STRUCTURAL FEATURES

Cited by: 24
Authors
Awaida, Sameh M. [1]
Mahmoud, Sabri A. [2]
Affiliations
[1] Qassim Univ, Qasim, Saudi Arabia
[2] King Fahd Univ Petr & Minerals, Dept Informat & Comp Sci, Dhahran 31261, Saudi Arabia
Keywords
Arabic writer identification system; feature combination; feature extraction; handwriting analysis; handwritten text; text-independent writer identification; VERIFICATION;
DOI
10.1080/01969722.2012.732802
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This article addresses writer identification of handwritten Arabic text. Several types of structural and statistical features were extracted from handwritten Arabic text. A novel approach was used to extract structural features that build on some of the main characteristics of the Arabic language. Connected component features for Arabic handwritten text as well as gradient distribution features, windowed gradient distribution features, contour chain code distribution features, and windowed contour chain code distribution features were extracted. A nearest neighbor (NN) classifier was used with the Euclidean distance measure. Data reduction algorithms (viz. principal component analysis [PCA], linear discriminant analysis [LDA], multiple discriminant analysis [MDA], multidimensional scaling [MDS], and forward/backward feature selection algorithms) were used. A database of 500 paragraphs handwritten in Arabic by 250 writers was used. The paragraphs used were randomly generated from a large corpus. NN provided the best accuracy in text-independent writer identification, with a top-1 result of 88.0%, a top-5 result of 96.0%, and a top-10 result of 98.5% for the first 100 writers. When the work was extended to all 250 writers with the backward feature selection algorithm (using 54 out of 83 features), the system attained a top-1 result of 75.0%, a top-5 result of 91.8%, and a top-10 result of 95.4%.
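The abstract reports top-1, top-5, and top-10 results from a nearest neighbor classifier with the Euclidean distance measure. As a rough illustration of how such top-k figures can be computed, here is a minimal Python sketch. It is not the authors' implementation: the function names (`euclidean`, `rank_writers`, `top_k_accuracy`), the three-dimensional toy vectors, and the writer labels are all hypothetical, and the sketch assumes each writer is ranked by the distance to that writer's closest reference sample, a detail this record does not specify.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_writers(
    query: Sequence[float],
    references: Dict[str, List[Sequence[float]]],
) -> List[str]:
    """Order writer IDs by the distance from the query to each writer's
    nearest reference sample (assumed ranking rule, see lead-in)."""
    best = {
        writer: min(euclidean(query, sample) for sample in samples)
        for writer, samples in references.items()
    }
    # Break exact distance ties by writer ID so the ranking is deterministic.
    return sorted(best, key=lambda writer: (best[writer], writer))


def top_k_accuracy(
    queries: List[Tuple[str, Sequence[float]]],
    references: Dict[str, List[Sequence[float]]],
    k: int,
) -> float:
    """Fraction of queries whose true writer appears among the k best ranks."""
    hits = sum(
        1
        for true_writer, vector in queries
        if true_writer in rank_writers(vector, references)[:k]
    )
    return hits / len(queries)


# Hypothetical toy data: three writers, one reference vector each.
references = {
    "writer_a": [[0.10, 0.80, 0.30]],
    "writer_b": [[0.70, 0.20, 0.50]],
    "writer_c": [[0.40, 0.60, 0.35]],
}
# Each query is (true writer, feature vector of an unseen paragraph).
queries = [
    ("writer_a", [0.12, 0.78, 0.33]),
    ("writer_b", [0.50, 0.45, 0.42]),
]

print(rank_writers(queries[0][1], references))
print(top_k_accuracy(queries, references, k=1))
print(top_k_accuracy(queries, references, k=2))
```

In this toy set the second query is closest to the wrong writer but its true writer is ranked second, so accuracy rises from top-1 to top-2, the same pattern as the rising top-1, top-5, and top-10 figures in the abstract.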
Pages: 57-76
Page count: 20