Lattice Decoding and Rescoring with Long-Span Neural Network Language Models

Cited by: 0
Authors
Sundermeyer, Martin [1]
Tueske, Zoltan [1]
Schlueter, Ralf [1]
Ney, Hermann [1, 2]
Affiliations
[1] Rhein Westfal TH Aachen, Human Language Technol & Pattern Recognit, Dept Comp Sci, Aachen, Germany
[2] LIMSI CNRS, Spoken Language Proc Grp, Paris, France
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
speech recognition; language modeling; recurrent neural networks; long short-term memory; word lattices;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With long-span neural network language models, considerable improvements have been obtained in speech recognition. However, it is difficult to apply these models if the underlying search space is large. In this paper, we combine previous work on lattice decoding with long short-term memory (LSTM) neural network language models. By adding refined pruning techniques, we are able to reduce the search effort by a factor of three. Furthermore, we introduce two novel approximations for full lattice rescoring, which opens up the potential of lattice-based speech recognition techniques. Compared to 1000-best lists, we find that we can increase the word error rate improvements obtained with LSTMs from 8.2 % to 10.7 % relative over a state-of-the-art baseline, while the resulting lattices are even considerably smaller. In addition, we investigate the use of LSTMs for Babel Assamese keyword search, obtaining significant improvements of 2.5 % relative.
Pages: 661-665
Number of pages: 5
Related papers
27 in total
[1] Arisoy E., P NAACL HLT 2012 WOR, P20
[2] Auli M., P EMNLP 2013, P1044
[3] Bengio Y, 2001, ADV NEUR IN, V13, P932
[4] Bisani M., P ICASSP 2004, P409
[5] Brown P. F., COMPUTATIONAL LINGUI, V18, P467
[6] Deoras A., P EMNLP 2011, P1116
[7] Goodman J., P ICASSP 2001, P561
[8] Graves A., P ICASSP 2013, P6645
[9] Hermansky H., P ICASSP 2000, P1635
[10] Hochreiter S, 1997, NEURAL COMPUT, V9, P1735, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]