Word-Level Script Identification from Handwritten Multi-script Documents

Cited by: 5
Authors
Singh, Pawan Kumar [1]
Mondal, Arafat [1]
Bhowmik, Showmik [1]
Sarkar, Ram [1]
Nasipuri, Mita [1]
Affiliations
[1] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata, India
Source
PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON FRONTIERS OF INTELLIGENT COMPUTING: THEORY AND APPLICATIONS (FICTA) 2014, VOL 1 | 2015 / Vol. 327
Keywords
Script identification; Handwritten Indic scripts; Texture based feature; Shape based feature; Multiple Classifiers
DOI
10.1007/978-3-319-11933-5_62
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a robust word-level handwritten script identification technique. A combination of shape-based and texture-based features is used to identify the script of handwritten word images written in any of five scripts, namely Bangla, Devnagari, Malayalam, Telugu and Roman. An 87-element feature set is designed to evaluate the present script recognition technique. The technique has been tested on 3000 handwritten words, with each script contributing about 600 words. Based on the identification accuracies of multiple classifiers, the Multi Layer Perceptron (MLP) has been chosen as the best classifier for the present work. With 5-fold cross validation and an epoch size of 500, the MLP classifier produces the best recognition accuracy of 91.79%, which is quite impressive considering the shape variations among these scripts.
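The evaluation protocol in the abstract (3000 words, five scripts, 5-fold cross validation) can be illustrated with a minimal sketch of how the data would be partitioned. It uses only the Python standard library and does not reproduce the authors' features or classifier. The exact count of 600 words per script, the stratified split, and the helper name `stratified_folds` are assumptions made for illustration; the abstract says only "about 600" and does not state how folds were drawn.

```python
import random
from collections import Counter

SCRIPTS = ["Bangla", "Devnagari", "Malayalam", "Telugu", "Roman"]
WORDS_PER_SCRIPT = 600  # assumption: the abstract says "about 600" per script
N_FOLDS = 5


def stratified_folds(labels, n_folds, seed=0):
    """Hypothetical helper: split sample indices into n_folds lists,
    spreading each class as evenly as possible across the folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# One label per word image; a real experiment would pair each label
# with its 87-element feature vector.
labels = [script for script in SCRIPTS for _ in range(WORDS_PER_SCRIPT)]
folds = stratified_folds(labels, N_FOLDS)

for k, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    per_script = Counter(labels[i] for i in test_idx)
    # A classifier (the paper selects an MLP) would be trained on
    # train_idx and scored on test_idx at this point.
    print(k, len(train_idx), len(test_idx), dict(per_script))
```

Under these assumptions each fold trains on 2400 words and tests on 600, with 120 test words per script, and the reported accuracy would be the result aggregated over the five held-out folds.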
Pages: 551-558
Number of pages: 8