MLP emulation of N-gram models as a first step to connectionist language modeling

Citations: 0

Authors:

Castro, MJ [1]

Prat, F [1]

Casacuberta, F [1]

Affiliations:

[1] Univ Politecn Valencia, Dept Sistemes Informat & Computacio, Valencia, Spain

Source:

NINTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS (ICANN99), VOLS 1 AND 2 | 1999 / Issue 470

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In problems such as automatic speech recognition and machine translation, where the system response must be a sentence in a given language, language models are employed in order to improve system performance. These language models are usually N-gram models (for instance, bigram or trigram models) which are estimated from large text databases using the occurrence frequencies of these N-grams. In 1989, Nakamura and Shikano empirically showed how multilayer perceptrons can emulate trigram model predictive capabilities with additional generalization features. Our paper discusses Nakamura and Shikano's work, provides new empirical evidence on multilayer perceptron capability to emulate N-gram models, and proposes new directions for extending neural network-based language models. The experimental work we present here compares connectionist phonological bigram models with a conventional one using different measures, which include recognition performances in a Spanish acoustic-phonetic decoding task.


Pages: 910-915

Page count: 6
