Statistical Machine Translation as a Language Model for Handwriting Recognition

Cited by: 7
Authors
Devlin, Jacob [1]
Kamali, Matin [1]
Subramanian, Krishna [1]
Prasad, Rohit [1]
Natarajan, Prem [1]
Affiliation
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
Source
13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012) | 2012
DOI
10.1109/ICFHR.2012.273
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When performing handwriting recognition on natural language text, the use of a word-level language model (LM) is known to significantly improve recognition accuracy. The most common type of language model, the n-gram model, decomposes sentences into short, overlapping chunks. In this paper, we propose a new type of language model, which we use in addition to the standard n-gram LM. Our new model uses the likelihood score from a statistical machine translation system as a reranking feature. In general terms, we automatically translate each OCR hypothesis into another language, and then create a feature score based on how "difficult" it was to perform the translation. Intuitively, the difficulty of translation correlates with how well-formed the input sentence is. In an Arabic handwriting recognition task, we were able to obtain a 0.4% absolute improvement in word error rate (WER) on top of a powerful 5-gram LM.
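The reranking idea in the abstract can be illustrated with a minimal Python sketch. It assumes a log-linear combination of per-hypothesis feature scores, which is a common way to rerank n-best lists but is not confirmed by the abstract as the authors' exact formulation. The feature names (`recognizer`, `ngram_lm`, `smt`), the weights, and the toy scores are all hypothetical, and the translation likelihood is supplied as a precomputed number rather than produced by a real translation system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One entry of an n-best list with log-domain feature scores (hypothetical)."""
    text: str
    features: Dict[str, float]


def combined_score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    # Weighted sum of log-domain features; only features named in `weights` count.
    return sum(w * hyp.features[name] for name, w in weights.items())


def rerank(nbest: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    # Highest combined score first.
    return sorted(nbest, key=lambda h: combined_score(h, weights), reverse=True)


# Toy n-best list. The "smt" value stands in for a translation likelihood:
# a lower value means the hypothesis was "harder" to translate.
nbest = [
    Hypothesis("the cat sat on the mat",
               {"recognizer": -10.0, "ngram_lm": -12.0, "smt": -8.0}),
    Hypothesis("the cat sat on the mad",
               {"recognizer": -9.5, "ngram_lm": -12.2, "smt": -15.0}),
]

# Assumed weights; in practice these would be tuned on held-out data.
baseline_weights = {"recognizer": 1.0, "ngram_lm": 1.0}
smt_weights = {"recognizer": 1.0, "ngram_lm": 1.0, "smt": 0.5}

best_baseline = rerank(nbest, baseline_weights)[0].text
best_with_smt = rerank(nbest, smt_weights)[0].text
```

In this toy example the baseline prefers the ill-formed hypothesis by a small margin, while adding the translation feature moves the well-formed one to the top. The numbers are chosen only to show the mechanism and say nothing about the reported 0.4% WER gain.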
Pages: 291 - 296
Page count: 6
Related papers
50 in total
  • [21] A Topic-Triggered Translation Model for Statistical Machine Translation
    Su Jinsong
    Wang Zhihao
    Wu Qingqiang
    Yao Junfeng
    Long Fei
    Zhang Haiying
    CHINESE JOURNAL OF ELECTRONICS, 2017, 26 (01) : 65 - 72
  • [22] Model Transformation by Example with Statistical Machine Translation
    Berramla, Karima
    Deba, El Abbassia
    Wu, Jiechen
    Sahraoui, Houari
    Benyamina, Abou El Hassen
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON MODEL-DRIVEN ENGINEERING AND SOFTWARE DEVELOPMENT (MODELSWARD), 2020, : 76 - 83
  • [23] A syntactic transformation model for statistical machine translation
    Nguyen, Thai Phuong
    Shimazu, Akira
    Computer Processing of Oriental Languages, Proceedings: BEYOND THE ORIENT: THE RESEARCH CHALLENGES AHEAD, 2006, 4285 : 63 - 74
  • [24] An integrated reordering model for statistical machine translation
    Chao, Wen-Han
    Lie, Zhou-Jun
    Chen, Yue-Xin
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827 : 955 - +
  • [25] Factored Statistical Machine Translation System for English to Tamil Language
    Anand, Kumar M.
    Dhanalakshmi
    Soman, K. P.
    Rajendran, S.
    PERTANIKA JOURNAL OF SOCIAL SCIENCE AND HUMANITIES, 2014, 22 (04) : 1045 - 1061
  • [26] Towards incorporating language morphology into statistical machine translation systems
    Karageorgakis, P
    Potamianos, A
    Klasinas, I
    2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2005, : 80 - 85
  • [27] Farsi - German statistical machine translation through bridge language
    Bakhshaei S.
    Khadivi S.
    Riahi N.
    2010 5th International Symposium on Telecommunications, IST 2010, 2010, : 557 - 561
  • [28] Linguistic Factors in Statistical Machine Translation Involving Arabic Language
    Youssef, Islam
    Sakr, Mohamed
    Kouta, Mohamed
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2009, 9 (11) : 154 - 159
  • [29] Analysis, preparation, and optimization of statistical sign language machine translation
    Stein, Daniel
    Schmidt, Christoph
    Ney, Hermann
    MACHINE TRANSLATION, 2012, 26 (04) : 325 - 357
  • [30] Statistical Relational Learning for Handwriting Recognition
    Shivram, Arti
    Khot, Tushar
    Natarajan, Sriraam
    Govindaraju, Venu
    INDUCTIVE LOGIC PROGRAMMING, ILP 2014, 2015, 9046 : 126 - 138