Training of an on-line handwritten Japanese character recognizer by artificial patterns

Cited by: 9
Authors
Chen, Bin [1]
Zhu, Bilan [1]
Nakagawa, Masaki [1]
Affiliations
[1] Tokyo Univ Agr & Technol, Dept Comp & Informat Sci, Tokyo, Japan
Funding
Japan Society for the Promotion of Science
Keywords
On-line handwriting recognition; Artificial patterns; Linear distortion models; Non-linear distortion model; Combined model; Pattern selection strategy;
DOI
10.1016/j.patrec.2012.07.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents the effects of training an on-line handwritten Japanese character recognizer, based on the Markov Random Field model, with a large number of artificially generated patterns. In general, the more training patterns, the higher the recognition accuracy. In practice, however, the available pattern samples are insufficient, especially for languages with large character sets, for which a larger number of parameters must be adjusted. We use six types of linear distortion models, combining them with one another and with a non-linear distortion model, to generate a large number of artificial patterns. These models are based on several geometric transformation models that are considered to simulate distortions in real handwriting. We apply these models to the TUAT Nakayosi database, expanding its volume by up to 300 times, and evaluate on the TUAT Kuchibue database, where a notable improvement in recognition accuracy is observed. The effect is analyzed for subgroups of the character set, and a significant effect is observed for Kanji, the ideographic characters of Chinese origin. This paper also considers the order in which linear and non-linear distortion models are applied, and the strategy for selecting patterns from the original database: from patterns close to the character class models to those far from them, or vice versa. For this analysis, we merge the Nakayosi and Kuchibue databases, take 100 patterns from the merged database to form the testing set, and use the remaining samples as the training set. Regarding order, applying linear and then non-linear distortions produces higher recognition accuracy. Regarding strategy, selecting patterns starting from those far from the character class models and moving toward those close to them produces higher accuracy. © 2012 Elsevier B.V. All rights reserved.
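As a rough illustration of the generation scheme the abstract describes, the following Python sketch applies a linear distortion and then a non-linear one to an on-line pattern stored as a list of pen strokes. It is only a sketch under stated assumptions: the abstract does not name the six linear models or the non-linear model, so rotation, shear, and axis scaling stand in for the linear ones and a sinusoidal warp stands in for the non-linear one. The function names, parameter ranges, and the assumption that coordinates are normalized to the unit square are all hypothetical.

```python
import math
import random
from typing import List, Tuple

Stroke = List[Tuple[float, float]]
Pattern = List[Stroke]  # one character = ordered list of pen strokes


def linear_distort(pattern: Pattern, angle_deg: float = 0.0, shear: float = 0.0,
                   scale_x: float = 1.0, scale_y: float = 1.0) -> Pattern:
    """Shear, scale, then rotate every point about the box centre (0.5, 0.5).

    Stand-in for the paper's linear models, which the abstract does not list.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = []
    for stroke in pattern:
        new_stroke = []
        for x, y in stroke:
            u, v = x - 0.5, y - 0.5                       # move centre to origin
            u = u + shear * v                             # horizontal shear
            u, v = scale_x * u, scale_y * v               # axis scaling
            u, v = cos_a * u - sin_a * v, sin_a * u + cos_a * v  # rotation
            new_stroke.append((u + 0.5, v + 0.5))
        out.append(new_stroke)
    return out


def nonlinear_distort(pattern: Pattern, amp_x: float = 0.0,
                      amp_y: float = 0.0) -> Pattern:
    """Smooth warp t -> t + amp * sin(pi * t), which leaves 0 and 1 fixed.

    Assumed stand-in for the paper's non-linear distortion model.
    """
    return [[(x + amp_x * math.sin(math.pi * x),
              y + amp_y * math.sin(math.pi * y)) for x, y in stroke]
            for stroke in pattern]


def generate_artificial(pattern: Pattern, n: int, seed: int = 0) -> List[Pattern]:
    """Generate n distorted copies, applying linear then non-linear distortion."""
    rng = random.Random(seed)
    result = []
    for _ in range(n):
        p = linear_distort(pattern,
                           angle_deg=rng.uniform(-10.0, 10.0),  # hypothetical ranges
                           shear=rng.uniform(-0.2, 0.2),
                           scale_x=rng.uniform(0.9, 1.1),
                           scale_y=rng.uniform(0.9, 1.1))
        p = nonlinear_distort(p,
                              amp_x=rng.uniform(-0.1, 0.1),
                              amp_y=rng.uniform(-0.1, 0.1))
        result.append(p)
    return result


# Usage: a two-stroke, cross-shaped pattern expanded into five artificial variants.
pattern = [[(0.2, 0.5), (0.8, 0.5)], [(0.5, 0.1), (0.5, 0.9)]]
variants = generate_artificial(pattern, n=5, seed=1)
```

The sketch applies the linear step first because the abstract reports that this order gave higher recognition accuracy. The stroke order and point count of the original pattern are preserved, which matters for an on-line recognizer that consumes pen trajectories.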
Pages: 178-185
Number of pages: 8