Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition

Cited by: 152
Authors
Zhang, Jianshu [1]
Du, Jun [1]
Zhang, Shiliang [1]
Liu, Dan [2]
Hu, Yulong [2]
Hu, Jinshui [2]
Wei, Si [2]
Dai, Lirong [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Anhui, Peoples R China
[2] IFLYTEK Res, Hefei, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Handwritten mathematical expression recognition; Neural network; Attention; Features
DOI
10.1016/j.patcog.2017.06.017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine recognition of a handwritten mathematical expression (HME) is challenging due to the ambiguities of handwritten symbols and the two-dimensional structure of mathematical expressions. Inspired by recent work in deep learning, we present Watch, Attend and Parse (WAP), a novel end-to-end approach based on a neural network that learns to recognize HMEs in a two-dimensional layout and outputs them as one-dimensional character sequences in LaTeX format. Unlike traditional methods, our proposed model avoids problems that stem from symbol segmentation, and it does not require a predefined expression grammar. Meanwhile, the problems of symbol recognition and structural analysis are handled, respectively, by a watcher and a parser. The watcher is a convolutional neural network encoder that takes HME images as input, and the parser is a recurrent neural network decoder equipped with an attention mechanism that generates LaTeX sequences. Moreover, the correspondence between the input expressions and the output LaTeX sequences is learned automatically by the attention mechanism. We validate the proposed approach on a benchmark published by the CROHME international competition. Using the official training dataset, WAP significantly outperformed the state-of-the-art method with an expression recognition accuracy of 46.55% on CROHME 2014 and 44.55% on CROHME 2016. © 2017 Elsevier Ltd. All rights reserved.
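The attention step that links the watcher's output to the parser can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how attention scores are computed, so a plain dot product between each feature vector and the decoder state is assumed here, and the names `attend`, `annotations`, and `state` are hypothetical. The sketch only shows how a decoder state could be turned into weights over image positions and a context vector for predicting the next LaTeX token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(annotations, state):
    """One attention step over a flattened feature map.

    annotations: list of feature vectors, one per image position
                 (standing in for the CNN watcher's output).
    state:       current decoder hidden state; assumed to have the same
                 dimension as each annotation so a dot product is defined.
    Returns (weights, context): a probability per position and the
    weighted sum of the annotation vectors.
    """
    # Assumed scoring function: dot product (a simplification).
    scores = [sum(a * s for a, s in zip(vec, state)) for vec in annotations]
    weights = softmax(scores)
    dim = len(state)
    context = [
        sum(w * vec[d] for w, vec in zip(weights, annotations))
        for d in range(dim)
    ]
    return weights, context


# Toy 2x2 feature map flattened to 4 positions with 3 channels each.
annotations = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
state = [0.0, 2.0, 0.0]
weights, context = attend(annotations, state)
```

In this toy input, the positions whose second channel is active receive the largest weights, which mirrors the idea that the parser focuses on the image region relevant to the symbol it is about to emit.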
Pages: 196-206
Number of pages: 11