Word-Level Script Identification from Handwritten Multi-script Documents

Cited by: 5
Authors
Singh, Pawan Kumar [1]
Mondal, Arafat [1]
Bhowmik, Showmik [1]
Sarkar, Ram [1]
Nasipuri, Mita [1]
Affiliation
[1] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata, India
Source
PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON FRONTIERS OF INTELLIGENT COMPUTING: THEORY AND APPLICATIONS (FICTA) 2014, VOL 1 | 2015, Vol. 327
Keywords
Script identification; Handwritten Indic scripts; Texture based feature; Shape based feature; Multiple Classifiers;
DOI
10.1007/978-3-319-11933-5_62
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a robust word-level technique for identifying the script of handwritten text. A combination of shape-based and texture-based features is used to identify the script of handwritten word images written in any of five scripts: Bangla, Devnagari, Malayalam, Telugu, and Roman. An 87-element feature set is designed for this purpose. The technique has been tested on 3000 handwritten words, with each script contributing about 600 words. Based on the identification accuracies of multiple classifiers, the Multi Layer Perceptron (MLP) was chosen as the best classifier for this work. With 5-fold cross validation and an epoch size of 500, the MLP classifier achieves its best recognition accuracy of 91.79%, a strong result given the shape variations among these scripts.
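The 5-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch. It assumes exactly 600 words per script (the abstract says "about 600") and a stratified, round-robin fold assignment; the abstract does not say whether the authors stratified their folds, and the helper names `stratified_folds` and `split_for_fold` are hypothetical. The classifier and the 87-element feature extraction are deliberately left out.

```python
from collections import Counter

# The five scripts named in the abstract.
SCRIPTS = ["Bangla", "Devnagari", "Malayalam", "Telugu", "Roman"]
WORDS_PER_SCRIPT = 600  # assumption: the abstract says "about 600"
N_FOLDS = 5


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds round-robin within each class."""
    folds = [[] for _ in range(n_folds)]
    seen = Counter()
    for idx, label in enumerate(labels):
        folds[seen[label] % n_folds].append(idx)
        seen[label] += 1
    return folds


def split_for_fold(folds, held_out):
    """Return (train, test) index lists with one fold held out for testing."""
    test = list(folds[held_out])
    train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
    return train, test


# One label per word image: 3000 labels in total under the assumption above.
labels = [script for script in SCRIPTS for _ in range(WORDS_PER_SCRIPT)]
folds = stratified_folds(labels, N_FOLDS)

for k in range(N_FOLDS):
    train_idx, test_idx = split_for_fold(folds, k)
    per_script = Counter(labels[i] for i in test_idx)
    print(f"fold {k}: train={len(train_idx)} test={len(test_idx)} "
          f"per-script test counts={dict(per_script)}")
```

Under these assumptions each round trains on 2400 words and tests on 600, with 120 test words per script. A reported figure such as 91.79% would then be an accuracy aggregated over the five held-out folds.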
Pages: 551-558
Page count: 8