Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition

Cited by: 10
Authors
Tueske, Zoltan [1 ,2 ]
Schlueter, Ralf [1 ]
Ney, Hermann [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Human Language Technol & Pattern Recognit, D-52056 Aachen, Germany
[2] IBM Res, Thomas J Watson Res Ctr, POB 704, Yorktown Hts, NY 10598 USA
Funding
European Research Council
Keywords
speech recognition; language modeling; LSTM; n-gram; neural networks
DOI
10.21437/Interspeech.2018-2476
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recurrent neural networks (NN) with long short-term memory (LSTM) are the current state of the art for modeling long-term dependencies. However, recent studies indicate that NN language models (LM) need only a limited history length to achieve excellent performance. In this paper, we extend the previous investigation of LSTM-network-based n-gram modeling to the domain of automatic speech recognition (ASR). First, applying recent optimization techniques and up to 6-layer LSTM networks, we improve LM perplexities by nearly 50% relative compared to classic count models on three different domains. Then, we demonstrate experimentally that, when the LM history is limited, perplexities improve significantly only up to 40-grams. Nevertheless, ASR performance already saturates around 20-grams despite across-sentence modeling. Analysis indicates that the performance gain of LSTM NN LMs over count models results only partially from their longer context and cross-sentence modeling capabilities. Using equal context, we show that a deep 4-gram LSTM can significantly outperform large interpolated count models by performing backing-off and smoothing significantly better. This observation also underlines the decreasing importance of combining state-of-the-art deep NN LMs with count-based models.
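To make the central idea of the abstract concrete, the following is a minimal sketch (not the authors' code) of how an LSTM language model can be restricted to an n-gram context: for each position, the network is re-run from a fresh state on only the last n-1 words, which is the "LSTM n-gram" setting the paper evaluates at different context lengths. It assumes PyTorch; the class and function names (WordLSTMLM, ngram_lstm_logprob) and all hyperparameters are illustrative, not taken from the paper.

```python
# Minimal illustrative sketch: scoring a sentence with an LSTM LM whose
# visible history is truncated to the last (n-1) tokens ("LSTM n-gram").
import math
import torch
import torch.nn as nn


class WordLSTMLM(nn.Module):
    """Plain word-level LSTM language model (embedding -> LSTM -> softmax)."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, time) -> logits: (batch, time, vocab)
        out, _ = self.lstm(self.embed(token_ids))
        return self.proj(out)


def ngram_lstm_logprob(model, token_ids, n):
    """Sum of log P(w_t | truncated history), history clipped to n-1 words.

    For every position t the LSTM is re-run from a zero state on the window
    token_ids[max(0, t-(n-1)) : t], which is what limiting an LSTM LM to an
    n-gram context amounts to in this sketch.
    """
    total = 0.0
    for t in range(1, len(token_ids)):
        history = token_ids[max(0, t - (n - 1)):t]
        inp = torch.tensor([history], dtype=torch.long)
        with torch.no_grad():
            logits = model(inp)                      # (1, len(history), vocab)
        log_probs = torch.log_softmax(logits[0, -1], dim=-1)
        total += log_probs[token_ids[t]].item()
    return total


if __name__ == "__main__":
    # Toy usage with random weights; real use would load a trained model.
    vocab_size = 1000
    model = WordLSTMLM(vocab_size).eval()
    sentence = [2, 15, 7, 42, 3]                     # <s> w1 w2 w3 </s> as ids
    for n in (4, 20, 40):                            # context lengths discussed in the paper
        lp = ngram_lstm_logprob(model, sentence, n)
        ppl = math.exp(-lp / (len(sentence) - 1))
        print(f"{n}-gram LSTM perplexity: {ppl:.1f}")
```

The sliding-window re-evaluation is deliberately simple and inefficient; it only serves to show what "limiting the LM history" means operationally when comparing, e.g., 4-gram, 20-gram, and 40-gram LSTM contexts against a full-history, across-sentence LSTM.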
Pages: 3358-3362
Number of pages: 5