Comparison of Various Neural Network Language Models in Speech Recognition

Cited by: 1
Authors
Zuo, Lingyun [1]
Liu, Jian [1, 2]
Wan, Xin [3]
Affiliations
[1] IACAS, Key Lab Speech Acoust & Content, Beijing, Peoples R China
[2] Chinese Acad Sci, XTIPC, Xinjiang Lab Minor Speech & Language Informat Pro, Beijing, Peoples R China
[3] Natl Comp Network Emergency Response Tech Team, Coordinat Ctr, Beijing, Peoples R China
Keywords
neural network language model; LSTM; speech recognition; n-best list rescoring
DOI
10.1109/ICISCE.2016.195
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In recent years, research on language modeling for speech recognition has increasingly focused on the application of neural networks. However, the performance of neural network language models depends strongly on their architecture. Three competing concepts have been developed: first, feed-forward neural networks, which represent an n-gram approach; second, recurrent neural networks, which may learn context dependencies spanning more than a fixed number of predecessor words; third, long short-term memory (LSTM) neural networks, which can fully exploit the correlations in a telephone conversation corpus. In this paper, we compare count models with feed-forward, recurrent, and LSTM neural network language models on conversational telephone speech recognition tasks. Furthermore, we put forward a language model estimation method that incorporates information from preceding sentences. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation between the two quantities, which we find to hold regardless of the underlying type of language model. The experimental results show that the LSTM neural network language model performs best in n-best list rescoring. Compared with first-pass decoding, rescoring the ten best candidate results yields a 4.3% relative reduction in average word error rate on conversational telephone speech recognition tasks.
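The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the log-linear score combination, the `lm_weight` value, and the toy language model are all assumptions made for illustration, and the abstract does not state how the first-pass and language model scores were actually combined.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-word natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


def rescore_nbest(nbest, lm_score, lm_weight=0.5):
    """Pick the hypothesis with the highest combined score.

    nbest: list of (words, first_pass_log_score) pairs.
    lm_score: callable mapping a word list to a log probability
              (hypothetical stand-in for a neural network language model).
    lm_weight: assumed interpolation weight for the language model score.
    """
    def combined(item):
        words, first_pass = item
        return first_pass + lm_weight * lm_score(words)

    return max(nbest, key=combined)[0]


def relative_reduction(baseline_wer, new_wer):
    """Relative word error rate reduction with respect to the baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Toy stand-in language model (assumption): prefers one fixed word sequence.
def toy_lm(words):
    return -2.0 if words == ["ice", "cream"] else -6.0


# Two candidate transcriptions with made-up first-pass scores.
candidates = [(["i", "scream"], -10.0), (["ice", "cream"], -10.5)]
best = rescore_nbest(candidates, toy_lm)
```

In this toy case the language model score overturns the first-pass ranking, which is the effect n-best list rescoring relies on. The reported 4.3% figure is a relative reduction in the sense of `relative_reduction`, not an absolute difference in word error rate.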
Pages: 894-898
Page count: 5