Visual-Textual Attention for Tree-Based Handwritten Mathematical Expression Recognition

Cited by: 0

Authors:

Liao, Wei [1]

Liu, Jiayi [1]

Chen, Jianghan [1]

Wang, Qiu-Feng [1]

Huang, Kaizhu [2]

Affiliations:

[1] Xi'an Jiaotong-Liverpool University, School of Advanced Technology, Suzhou, People's Republic of China

[2] Duke Kunshan University, Data Science Research Center, Suzhou, People's Republic of China

Source:

ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, BICS 2023 | 2024 / Volume 14374

Funding:

National Natural Science Foundation of China;

Keywords:

Handwritten mathematical expression recognition; Tree decoder; Visual-textual attention; Mutual learning; DECODER;

DOI:

10.1007/978-981-97-1417-9_35

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

Handwritten mathematical expression recognition (HMER) has attracted much attention and achieved remarkable progress under the encoder-decoder framework. However, it remains challenging because of complex structures and illegible handwriting. In this paper, we refine the encoder-decoder framework for HMER in three ways. First, we propose a multi-scale visual and textual attention fusion mechanism that enriches the context with both spatial and semantic information. Second, most HMER methods treat the task as a sequence-to-sequence problem (i.e., predicting a LaTeX string) and ignore the structural information in mathematical expressions; to overcome this issue, we use a tree decoder to capture such structural context. Third, we propose a parent-children mutual learning method to enhance the learning of our encoder-decoder model. Extensive experiments on the CROHME 2014, 2016, and 2019 benchmark datasets demonstrate the effectiveness of the proposed method.


Pages: 375-384

Number of pages: 10