MLP emulation of N-gram models as a first step to connectionist language modeling

Cited by: 0
Authors
Castro, MJ [1]
Prat, F [1]
Casacuberta, F [1]
Affiliations
[1] Univ Politecn Valencia, Dept Sistemes Informat & Computacio, Valencia, Spain
Source
NINTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS (ICANN99), VOLS 1 AND 2 | 1999, No. 470
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In problems such as automatic speech recognition and machine translation, where the system response must be a sentence in a given language, language models are employed to improve system performance. These language models are usually N-gram models (for instance, bigram or trigram models), which are estimated from large text databases using the occurrence frequencies of the N-grams. In 1989, Nakamura and Shikano showed empirically that multilayer perceptrons can emulate the predictive capabilities of trigram models, with additional generalization features. Our paper discusses Nakamura and Shikano's work, provides new empirical evidence on the capability of multilayer perceptrons to emulate N-gram models, and proposes new directions for extending neural network-based language models. The experimental work we present here compares connectionist phonological bigram models with a conventional one using different measures, including recognition performance in a Spanish acoustic-phonetic decoding task.
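The abstract's remark that N-gram models are estimated from occurrence frequencies can be illustrated with a minimal sketch of a relative-frequency bigram model. This is not the authors' implementation: the function name `train_bigram`, the boundary markers `<s>` and `</s>`, and the toy corpus of phone-like symbols are all assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_bigram(sequences, bos="<s>", eos="</s>"):
    """Estimate P(next | prev) by relative frequency (no smoothing).

    `bos` and `eos` are hypothetical boundary markers added for this sketch.
    """
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        symbols = [bos] + list(seq) + [eos]
        # Count every adjacent pair (prev, next) in the padded sequence.
        for prev, nxt in zip(symbols, symbols[1:]):
            pair_counts[prev][nxt] += 1

    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        # Relative frequency: count(prev, next) / count(prev, anything).
        model[prev] = {nxt: c / total for nxt, c in counter.items()}
    return model


# Hypothetical toy corpus of phone-like symbol sequences (not from the paper).
corpus = [["k", "a", "s", "a"], ["k", "o", "s", "a"], ["m", "a", "s"]]
model = train_bigram(corpus)

for prev in sorted(model):
    print(prev, model[prev])
```

Each row of `model` is a conditional distribution over the next symbol. A connectionist bigram model of the kind the abstract describes would aim to produce comparable conditional probabilities from a network output rather than from a count table.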
Pages: 910-915
Page count: 6