A tree-based model with branch parallel decoding for handwritten mathematical expression recognition

Cited by: 4
Authors
Li, Zhe [1]
Yang, Wentao [1]
Qi, Hengnian [2]
Jin, Lianwen [1, 4]
Huang, Yichao [3]
Ding, Kai [3]
Affiliations
[1] South China Univ Technol, 381 Wushan Rd, Guangzhou, Peoples R China
[2] Huzhou Univ, 759 Erhuandong Rd, Huzhou 313000, Peoples R China
[3] IntSig Informat Co, 1268 Wanrong Rd, Shanghai 200040, Peoples R China
[4] South China Univ Technol, Sch Elect & Informat, Guangzhou, Peoples R China
Keywords
Handwritten mathematical expression recognition; Tree-based model; Parallel decoding; Attention mechanism
DOI
10.1016/j.patcog.2023.110220
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Handwritten mathematical expression recognition (HMER) is a challenging task in computer vision because of the complex two-dimensional spatial structure and diverse handwriting styles of mathematical expressions (MEs). Recent mainstream approaches treat MEs as tree-structured objects, modeled by sequence decoders or tree decoders. These decoders recognize the symbols and the relationships between them in depth-first order, resulting in many decoding steps that can harm performance, particularly for MEs with complex structures. In this paper, we propose a novel tree-based model with branch parallel decoding for HMER, which parses the structure of ME trees by explicitly predicting the relationships between symbols. In addition, a query constructing module is proposed to help the decoder decode the branches of ME trees in parallel, thus reducing the number of decoding time steps and alleviating the problem of long-sequence attention decoding. As a result, our model outperforms existing models on three widely used benchmarks and demonstrates significant improvements in HMER performance.
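The step-count argument in the abstract can be illustrated with a toy sketch. This is not the authors' model: the `Node` class, the relationship names ("right", "above", "below"), and both counting functions are hypothetical, and the parallel schedule shown (every open branch advances by one symbol per step) is only an assumption about what decoding branches in parallel might look like. Real depth-first decoders also emit relationship or structure tokens, so their step counts would be higher than the symbol count used here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One symbol in a toy ME tree (hypothetical structure, for illustration)."""
    symbol: str
    # Each child is stored with the relationship that links it to this node.
    children: list = field(default_factory=list)  # list of (relation, Node)


def chain(symbols):
    """Link symbols left to right using the 'right' relationship."""
    head = Node(symbols[0])
    node = head
    for s in symbols[1:]:
        nxt = Node(s)
        node.children.append(("right", nxt))
        node = nxt
    return head


def depth_first_steps(root):
    """Count steps when one symbol is decoded per step, in depth-first order."""
    steps = 0
    stack = [root]
    while stack:
        node = stack.pop()
        steps += 1
        stack.extend(child for _, child in node.children)
    return steps


def branch_parallel_steps(root):
    """Count steps when all open branches advance by one symbol per step."""
    steps = 0
    frontier = [root]
    while frontier:
        steps += 1
        frontier = [child for node in frontier for _, child in node.children]
    return steps


# Toy tree for \frac{a+b}{c+d}: numerator and denominator are separate branches.
frac = Node("\\frac", [("above", chain(["a", "+", "b"])),
                       ("below", chain(["c", "+", "d"]))])

print(depth_first_steps(frac), branch_parallel_steps(frac))  # prints: 7 4
```

In this toy case the depth-first count equals the number of symbols, while the parallel count equals the depth of the tree, so the gap grows as expressions gain more branches of similar length.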
Pages: 9