Statistical Machine Translation as a Language Model for Handwriting Recognition

Cited by: 7
Authors
Devlin, Jacob [1]
Kamali, Matin [1]
Subramanian, Krishna [1]
Prasad, Rohit [1]
Natarajan, Prem [1]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
Source
13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012) | 2012
DOI
10.1109/ICFHR.2012.273
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
When performing handwriting recognition on natural language text, the use of a word-level language model (LM) is known to significantly improve recognition accuracy. The most common type of language model, the n-gram model, decomposes sentences into short, overlapping chunks. In this paper, we propose a new type of language model, which we use in addition to the standard n-gram LM. Our new model uses the likelihood score from a statistical machine translation system as a reranking feature. In general terms, we automatically translate each OCR hypothesis into another language and then create a feature score based on how "difficult" it was to perform the translation. Intuitively, the difficulty of translation correlates with how well-formed the input sentence is. In an Arabic handwriting recognition task, we were able to obtain a 0.4% absolute improvement in word error rate (WER) on top of a powerful 5-gram LM.
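The reranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the feature set, the weights, or how the translation score is computed, so the `Hypothesis` fields, the `weights` dictionary, and the `toy_smt_score` lookup table below are all hypothetical stand-ins. The sketch only assumes that each n-best hypothesis carries a recognizer score and an n-gram LM score, and that a translation-likelihood score is added as one more term in a weighted sum.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hypothesis:
    text: str
    recognizer_logprob: float  # hypothetical score from the handwriting recognizer
    ngram_logprob: float       # hypothetical score from the n-gram LM


def rerank(
    nbest: List[Hypothesis],
    smt_score: Callable[[str], float],
    weights: Dict[str, float],
) -> List[Hypothesis]:
    """Sort hypotheses by a weighted sum of feature scores, best first."""

    def total(h: Hypothesis) -> float:
        return (
            weights["recognizer"] * h.recognizer_logprob
            + weights["ngram"] * h.ngram_logprob
            + weights["smt"] * smt_score(h.text)
        )

    return sorted(nbest, key=total, reverse=True)


# Stand-in for a real translation system: a lookup table of made-up
# log-likelihoods, where a lower value means "harder to translate".
toy_scores = {"the cat sat": -3.0, "the cat sad": -6.0}


def toy_smt_score(text: str) -> float:
    return toy_scores[text]


nbest = [
    Hypothesis("the cat sad", recognizer_logprob=-9.5, ngram_logprob=-5.2),
    Hypothesis("the cat sat", recognizer_logprob=-10.0, ngram_logprob=-5.0),
]

with_smt = rerank(nbest, toy_smt_score, {"recognizer": 1.0, "ngram": 1.0, "smt": 1.0})
without_smt = rerank(nbest, toy_smt_score, {"recognizer": 1.0, "ngram": 1.0, "smt": 0.0})
```

With these made-up numbers, the ill-formed hypothesis ranks first when the translation feature has zero weight, and the well-formed one ranks first once the feature is switched on. In a real system the weights would be tuned on held-out data rather than set by hand.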
Pages: 291-296
Number of pages: 6