A benchmark image database of isolated Bangla handwritten compound characters

Cited by: 44
Authors
Das, Nibaran [1]
Acharya, Kallol [1]
Sarkar, Ram [1]
Basu, Subhadip [1]
Kundu, Mahantapas [1]
Nasipuri, Mita [1]
Affiliations
[1] Jadavpur Univ, Comp Sci & Engn Dept, Kolkata 700032, India
Keywords
OCR; Handwritten character recognition; Bangla compound character; Benchmark database; SVM; Recognition
DOI
10.1007/s10032-014-0222-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This work presents a benchmark image database of isolated handwritten Bangla compound characters used in standard Bangla literature. A survey of more than 2 million Bangla words revealed that around 334 compound characters exist in Bangla script. Of these, only around 171 character classes form unique pattern shapes, and some of these classes are often written in multiple styles. Altogether, 55,278 isolated character images, belonging to 199 different pattern shapes, were collected using three different data collection modalities. The database is divided into training and test sets in a 4:1 ratio for each pattern class, with a balanced distribution of shapes from the different modalities. A convex hull and quadtree-based feature set was designed, and the test set recognition performance is reported with a support vector machine classifier. We achieved a recognition accuracy of 79.35% on the test database consisting of 171 character classes. The complete compound character image database is freely available as CMATERdb 3.1.3.3 from the website http://code.google.com/p/cmaterdb/, which may facilitate research on handwritten character recognition, especially related to Bangla form document processing systems.
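The 4:1 per-class split described in the abstract can be illustrated with a minimal sketch. It assumes that "balanced distribution" means stratifying by both pattern class and collection modality; the abstract does not state the exact procedure, so this is an illustration and not the authors' method. The function name `split_4_to_1`, the tuple layout, and the modality labels are all hypothetical.

```python
import random
from collections import defaultdict


def split_4_to_1(samples, seed=0):
    """Split samples into train/test at roughly 4:1 within each group.

    samples: list of (image_id, pattern_class, modality) tuples.
    Grouping by (pattern_class, modality) is an assumption made for
    illustration; the source only says the split is balanced.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sample in samples:
        _, pattern_class, modality = sample
        groups[(pattern_class, modality)].append(sample)

    train, test = [], []
    for key in sorted(groups):  # sorted for a reproducible order
        members = groups[key]
        rng.shuffle(members)
        n_test = round(len(members) / 5)  # one fifth goes to the test set
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Synthetic example: 3 classes x 3 hypothetical modalities x 10 images each.
demo = [
    (f"img_{c}_{m}_{i}", c, m)
    for c in range(3)
    for m in ("modality_a", "modality_b", "modality_c")
    for i in range(10)
]
demo_train, demo_test = split_4_to_1(demo)
print(len(demo_train), len(demo_test))
```

Stratifying within each group keeps every class and modality represented in both sets, which matters when some classes have few samples.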
Pages: 413-431
Page count: 19
Source
International Journal on Document Analysis and Recognition (IJDAR), 2014, 17: 413-431