An End-to-End System for Bangla Online Handwriting Recognition

Cited by: 0
Authors
Bhattacharya, Soumik [1]
Sen Maitra, Durjoy [1]
Bhattacharya, Ujjwal [1]
Parui, Swapan K. [1]
Affiliations
[1] Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata, India
Source
PROCEEDINGS OF 2016 15TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR) | 2016
DOI
10.1109/ICFHR.2016.71
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A few studies of online Bangla handwriting recognition, such as isolated character recognition or limited-vocabulary cursive word recognition, are found in the literature. However, the development of an end-to-end recognition system for unconstrained online Bangla handwritten text has not been duly attempted so far. In the present report, we describe such a system, which takes a piece of continuous online handwritten Bangla text as input. It first segments the input text into individual lines, each line into its constituent words, and each word into sub-strokes. In the present study, 152 different symbols have been considered, including basic characters, character modifiers, frequently used conjunct characters, a few special characters, and numerals. The entire set of sub-strokes obtained from the training sample set has been exhaustively studied by 3 experts, and 76 different sub-stroke shapes have been identified based on consensus among these experts. It has also been observed that a character may produce at most 3 sub-strokes. Since a piece of Bangla text often contains either Bangla or English numerals, the present character set includes both numeral sets; 3 numeral shapes are common to both scripts. The proposed recognition system uses two classifiers, one for characters and the other for sub-strokes. Sub-strokes are fed to the character classifier in their temporal order. A single sub-stroke, then two consecutive sub-strokes, and finally three successive sub-strokes are passed to the character classifier, and the top two responses of the character classifier among the three cases are compared. If the difference is less than a threshold, the response of the sub-stroke classifier is used to reach a final decision. The proposed system provided 94.3% character-level accuracy on a test set consisting of 33,453 word samples written by 31 writers.
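The decision rule summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the sub-stroke classifier's response resolves a close call, so the tie-break below (matching predicted sub-stroke shapes against a per-character shape table) is an assumption, and the names `decode_word`, `char_clf`, `stroke_clf`, and `char_shapes` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

SubStroke = str  # placeholder type; a real sub-stroke would be a pen-point sequence


def decode_word(
    sub_strokes: Sequence[SubStroke],
    char_clf: Callable[[Sequence[SubStroke]], Tuple[str, float]],
    stroke_clf: Callable[[SubStroke], str],
    char_shapes: Dict[str, Tuple[str, ...]],
    threshold: float,
    max_group: int = 3,  # the abstract states a character yields at most 3 sub-strokes
) -> List[str]:
    """Greedy left-to-right decoding of one word into character labels."""
    decoded: List[str] = []
    i = 0
    while i < len(sub_strokes):
        # Score groups of 1, 2 and 3 consecutive sub-strokes starting at i.
        candidates = []
        for k in range(1, min(max_group, len(sub_strokes) - i) + 1):
            label, score = char_clf(sub_strokes[i:i + k])
            candidates.append((score, k, label))
        candidates.sort(key=lambda c: c[0], reverse=True)
        best = candidates[0]

        # If the top two responses are closer than the threshold, consult the
        # sub-stroke classifier. ASSUMPTION: prefer the candidate whose expected
        # shape sequence (char_shapes) matches the predicted sub-stroke shapes.
        if len(candidates) > 1 and best[0] - candidates[1][0] < threshold:
            for cand in candidates[:2]:
                _, k, label = cand
                shapes = tuple(stroke_clf(s) for s in sub_strokes[i:i + k])
                if char_shapes.get(label) == shapes:
                    best = cand
                    break

        decoded.append(best[2])
        i += best[1]  # skip the sub-strokes consumed by the accepted character
    return decoded
```

Here `char_clf` is assumed to return a single best label with a score for each group, and the greedy consumption of sub-strokes is likewise a simplification; the paper itself may search over segmentations differently.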
Pages: 373-378
Page count: 6