Neural Error Corrective Language Models for Automatic Speech Recognition

被引:0
|
作者
Tanaka, Tomohiro [1 ]
Masumura, Ryo [1 ]
Masataki, Hirokazu [1 ]
Aono, Yushi [1 ]
机构
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
关键词
automatic speech recognition; language models; speech recognition error correction; conditional generative models;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present novel neural network based language models that can correct automatic speech recognition (ASR) errors by using speech recognizer output as a context. These models, called neural error corrective language models (NECLMs), utilizes ASR hypotheses of a target utterance as a context for estimating the generative probability of words. NECLMs are expressed as conditional generative models composed of an encoder network and a decoder network. In the models, the encoder network constructs context vectors from N-best lists and ASR confidence scores generated in a speech recognizer. The decoder network rescores recognition hypotheses by computing a generative probability of words using the context vectors so as to correct ASR errors. We evaluate the proposed models in Japanese lecture ASR tasks. Experimental results show that NECLM achieve better ASR performance than a state-of-the-art ASR system that incorporate a convolutional neural network acoustic model and a long short-term memory recurrent neural network language model.
引用
收藏
页码:401 / 405
页数:5
相关论文
共 50 条
  • [1] BIDIRECTIONAL RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION
    Arisoy, Ebru
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    Chen, Stanley
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5421 - 5425
  • [2] GEOGRAPHIC LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION
    Xiao, Xiaoqiang
    Chen, Hong
    Zylak, Mark
    Sosa, Daniela
    Desu, Suma
    Krishnamoorthy, Mahesh
    Liu, Daben
    Paulik, Matthias
    Zhang, Yuchen
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6124 - 6128
  • [3] SEMANTIC WORD EMBEDDING NEURAL NETWORK LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION
    Audhkhasi, Kartik
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5995 - 5999
  • [4] Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition
    Masumura, Ryo
    Asami, Taichi
    Oba, Takanobu
    Sakauchi, Sumitaka
    Ito, Akinori
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (12) : 2557 - 2567
  • [5] Error corrective mechanisms for speech recognition
    Mangu, L
    Padmanabhan, M
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 29 - 32
  • [6] Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition
    Chen, Xie
    Liu, Xunying
    Wang, Yongqiang
    Gales, Mark J. F.
    Woodland, Philip C.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 2146 - 2157
  • [7] A study of neural network Russian language models for automatic continuous speech recognition systems
    Kipyatkova, I. S.
    Karpov, A. A.
    AUTOMATION AND REMOTE CONTROL, 2017, 78 (05) : 858 - 867
  • [8] A study of neural network Russian language models for automatic continuous speech recognition systems
    I. S. Kipyatkova
    A. A. Karpov
    Automation and Remote Control, 2017, 78 : 858 - 867
  • [9] End-To-End deep neural models for Automatic Speech Recognition for Polish Language
    Pondel-Sycz, Karolina
    Pietrzak, Agnieszka Paula
    Szymla, Julia
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2024, 70 (02) : 315 - 321
  • [10] JOINT LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING
    Bayer, Ali Orkan
    Riccardi, Giuseppe
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 199 - 203