Error correction of semantic mathematical expressions based on Bayesian algorithm

Times cited: 0
Authors
Wang, Xue [1, 2]
Yang, Fang [1, 2]
Liu, Hongyuan [1, 2]
Shi, Qingxuan [1, 2]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China
Keywords
error correction; mathematical expressions; Bayesian algorithm; presentation MathML; content MathML; inference
DOI
10.3934/mbe.2022255
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The semantic information of mathematical expressions plays an important role in information retrieval and similarity calculation. However, many of the expressions in electronic scientific documents are encoded in presentation MathML, which does not capture semantic information. A shortcut for extracting that information is the rule mapping method, which converts expressions in presentation MathML into semantic expressions in content MathML. The conversion result is prone to semantic errors, however, because the two formats do not correspond exactly in grammatical structure or markup. This study proposes a Bayesian error correction algorithm that corrects the semantic errors in the output of the rule mapping method. Expressions in presentation MathML and content MathML from the NTCIR data set are used as the training set to optimize the parameters of the Bayesian model. Expressions in presentation MathML from documents collected by the laboratory from the CNKI website are used as the test set to evaluate the error correction results. The experimental results show that the average F1 value is 0.239 with the rule mapping method and 0.881 with the Bayesian error correction method, and that the average error correction rate is 0.853.
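The abstract does not give the details of the Bayesian model, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a naive setup in which a single presentation token is mapped to the content tag with the highest posterior probability, estimated from aligned training pairs with add-one smoothing. The function names `train` and `correct`, the toy training pairs, and the tag labels are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def train(pairs):
    """Count priors and likelihoods from (presentation_token, content_tag) pairs.

    The pairs stand in for aligned presentation/content MathML data;
    the alignment itself is assumed to be given.
    """
    prior = Counter()
    likelihood = defaultdict(Counter)
    for token, tag in pairs:
        prior[tag] += 1
        likelihood[tag][token] += 1
    return prior, likelihood


def correct(token, prior, likelihood, vocab_size, alpha=1.0):
    """Return the content tag maximizing P(tag) * P(token | tag).

    alpha is an add-alpha smoothing constant (an assumption of this sketch).
    """
    total = sum(prior.values())
    best_tag, best_score = None, -math.inf
    for tag, count in prior.items():
        log_prior = math.log(count / total)
        log_like = math.log(
            (likelihood[tag][token] + alpha) / (count + alpha * vocab_size)
        )
        score = log_prior + log_like
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


# Hypothetical toy data: a superscript or prime can mean different things.
pairs = [
    ("^", "power"), ("^", "power"), ("^", "transpose"),
    ("'", "diff"), ("'", "transpose"),
]
prior, likelihood = train(pairs)
vocab_size = len({token for token, _ in pairs})
print(correct("^", prior, likelihood, vocab_size))
```

A rule mapping step would propose one content tag per token; a corrector of this kind could then override the proposal when another tag has a clearly higher posterior. A realistic model would condition on the surrounding expression structure and not on a single token.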
Pages: 5428-5445
Number of pages: 18