Glyph-Based Data Augmentation for Accurate Kanji Character Recognition

Cited by: 1
Authors
Ofusa, Kenichiro [1]
Miyazaki, Tomo [1]
Sugaya, Yoshihiro [1]
Omachi, Shinichiro [1]
Affiliations
[1] Tohoku Univ, Grad Sch Engn, Sendai, Miyagi, Japan
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Keywords
Data augmentation; glyph; character recognition
DOI
10.1109/ICDAR.2017.103
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we address the problem of data augmentation for character recognition. In particular, we focus on incorporating glyph variation into the augmentation of character images, which is a simple approach to data augmentation. Existing methods generally increase the amount of data by distorting images, whereas the proposed method injects noise into glyphs, producing data with radical variation in the glyphs. The proposed method draws on a public database of kanji glyphs and augments the glyphs by injecting noise into them. We then generate kanji images automatically by placing stroke images on the augmented glyphs. We carried out kanji character recognition experiments using the augmented data, and the results show the effectiveness of the proposed method.
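The pipeline outlined in the abstract (perturb a glyph, then render an image from it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the glyph format (a list of strokes, each a list of control points in a unit box), the Gaussian noise model, the `sigma` value, and the function names `inject_noise`, `rasterize`, and `augment` are all hypothetical. The sketch also draws plain one-pixel polylines instead of placing stroke images as the paper describes.

```python
import random

# Hypothetical glyph format: a list of strokes, each a list of (x, y)
# control points inside the unit square. This is NOT the format of any
# real glyph database; it only serves the illustration.

def inject_noise(glyph, sigma=0.03, rng=None):
    """Return a copy of the glyph with Gaussian noise added to each point."""
    rng = rng or random.Random()
    noisy = []
    for stroke in glyph:
        noisy.append([
            # Clamp to [0, 1] so the perturbed glyph stays inside the box.
            (min(1.0, max(0.0, x + rng.gauss(0.0, sigma))),
             min(1.0, max(0.0, y + rng.gauss(0.0, sigma))))
            for x, y in stroke
        ])
    return noisy

def rasterize(glyph, size=32):
    """Draw each stroke as a one-pixel polyline on a size x size binary grid."""
    image = [[0] * size for _ in range(size)]
    steps = 2 * size  # dense sampling so consecutive samples leave no gaps
    for stroke in glyph:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            for i in range(steps + 1):
                t = i / steps
                col = round((x0 + t * (x1 - x0)) * (size - 1))
                row = round((y0 + t * (y1 - y0)) * (size - 1))
                image[row][col] = 1
    return image

def augment(glyph, count, sigma=0.03, seed=0):
    """Generate `count` rendered images from noisy variants of one glyph."""
    rng = random.Random(seed)
    return [rasterize(inject_noise(glyph, sigma, rng)) for _ in range(count)]

# Toy plus-shaped glyph: one horizontal and one vertical stroke.
TOY_GLYPH = [[(0.1, 0.5), (0.9, 0.5)], [(0.5, 0.1), (0.5, 0.9)]]
samples = augment(TOY_GLYPH, count=5)
```

Because the noise is applied to the glyph's control points before rendering, each sample differs in stroke placement rather than being a pixel-level distortion of a single rendered image, which is the contrast the abstract draws with image-distortion methods.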
Pages: 597-602
Page count: 6
Related papers
16 in total
  • [1] [Anonymous], 2014, Advances in neural information processing systems
  • [2] [Anonymous], 2016, ICLR WORKSH
  • [3] [Anonymous], [No title captured]
  • [4] Bai BD, 2007, LECT NOTES COMPUT SC, V4613, P261
  • [5] Bayoudh S, 2007, LECT NOTES ARTIF INT, V4701, P527
  • [6] Hassan T, 2010, DOCENG2010: PROCEEDINGS OF THE 2010 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, P181
  • [7] Parameterizable fonts based on shape components
    Hu, CY
    Hersch, RD
    [J]. IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2001, 21 (03) : 70 - 85
  • [8] Jakubiak E. J., 2006, ACM SIGGRAPH 2006 SK, DOI 10.1145/1179849.1180020
  • [9] Kamichi K., 2004, P GLYPH TYP WORKSH 2, P85
  • [10] Knuth D. E., 1986, THE METAFONTBOOK