Neural Error Corrective Language Models for Automatic Speech Recognition

Cited by: 0
Authors
Tanaka, Tomohiro [1 ]
Masumura, Ryo [1 ]
Masataki, Hirokazu [1 ]
Aono, Yushi [1 ]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
automatic speech recognition; language models; speech recognition error correction; conditional generative models
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present novel neural network based language models that can correct automatic speech recognition (ASR) errors by using speech recognizer output as context. These models, called neural error corrective language models (NECLMs), utilize ASR hypotheses of a target utterance as context for estimating the generative probability of words. NECLMs are expressed as conditional generative models composed of an encoder network and a decoder network. In the models, the encoder network constructs context vectors from N-best lists and ASR confidence scores generated by a speech recognizer. The decoder network rescores recognition hypotheses by computing the generative probability of words using the context vectors so as to correct ASR errors. We evaluate the proposed models on Japanese lecture ASR tasks. Experimental results show that NECLMs achieve better ASR performance than a state-of-the-art ASR system that incorporates a convolutional neural network acoustic model and a long short-term memory recurrent neural network language model.
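The rescoring idea in the abstract can be illustrated with a small toy sketch. It is not the NECLM itself: the neural encoder and decoder are replaced by hypothetical stand-ins (`build_context` uses confidence-weighted word counts over the N-best list, and `context_log_prob` uses a smoothed unigram distribution), and the interpolation weight `lm_weight`, the smoothing constant, and the example scores are all assumptions made only for illustration.

```python
import math
from collections import Counter


def build_context(nbest):
    """Toy stand-in for the encoder (assumption, not the paper's network):
    confidence-weighted word counts accumulated over the N-best list."""
    ctx = Counter()
    for words, confidence, _asr_score in nbest:
        for w in words:
            ctx[w] += confidence
    return ctx


def context_log_prob(words, ctx, smoothing=1e-3):
    """Toy stand-in for the decoder (assumption): log-probability of a word
    sequence under a smoothed unigram distribution derived from the context."""
    total = sum(ctx.values()) + smoothing * (len(ctx) + 1)
    return sum(math.log((ctx.get(w, 0.0) + smoothing) / total) for w in words)


def rescore(nbest, lm_weight=0.5):
    """Return the hypothesis with the best combined score:
    ASR log-score plus lm_weight times the context-conditioned log-probability."""
    ctx = build_context(nbest)
    best = max(
        nbest,
        key=lambda h: h[2] + lm_weight * context_log_prob(h[0], ctx),
    )
    return best[0]


# Each hypothesis: (words, ASR confidence, ASR log-score). Values are made up.
nbest = [
    (["recognize", "speech"], 0.9, -4.0),
    (["wreck", "a", "nice", "beach"], 0.3, -3.9),
    (["recognize", "beach"], 0.8, -4.2),
]

top_asr_only = rescore(nbest, lm_weight=0.0)   # ranks by ASR score alone
top_rescored = rescore(nbest, lm_weight=0.5)   # ranks by the combined score
```

With these made-up numbers, the ASR-only ranking prefers the low-confidence second hypothesis, while the combined score prefers the first, because its words are supported by the high-confidence entries in the list. A unigram scorer like this one also penalizes longer hypotheses, which a real conditional language model would handle differently.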
Pages: 401-405
Page count: 5
Related papers
50 in total
  • [31] Error Corrective Fusion of Classifier Scores for Spoken Language Recognition
    Dehzangi, Omid
    Ma, Bin
    Chng, Eng Siong
    Li, Haizhou
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (12): : 2503 - 2512
  • [32] Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models
    Tang, Zhiyuan
    Wang, Dong
    Huang, Shen
    Shang, Shidong
    INTERSPEECH 2024, 2024, : 1910 - 1914
  • [33] An Approach to Efficient Generation of High-Accuracy and Compact Error-Corrective Models for Speech Recognition
    Oba, Takanobu
    Hori, Takaaki
    Nakamura, Atsushi
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2676 - 2679
  • [34] Automatic Speech Recognition for Mixed Dialect Utterances by Mixing Dialect Language Models
    Hirayama, Naoki
    Yoshino, Koichiro
    Itoyama, Katsutoshi
    Mori, Shinsuke
    Okuno, Hiroshi G.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (02) : 373 - 382
  • [35] DYNAMIC ADJUSTMENT OF LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION USING WORD SIMILARITY
    Currey, Anna
    Illina, Irina
    Fohr, Dominique
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 426 - 432
  • [36] Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition
    Yadavalli, Aditya
    Mirishkar, Ganesh S.
    Vuppala, Anil Kumar
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 292 - 301
  • [37] Word segments in category-based language models for automatic speech recognition
    Justo, Raquel
    Torres, M. Ines
    PATTERN RECOGNITION AND IMAGE ANALYSIS, PT 1, PROCEEDINGS, 2007, 4477 : 249 - +
  • [38] Graphical models and automatic speech recognition
    Bilmes, JA
    MATHEMATICAL FOUNDATIONS OF SPEECH AND LANGUAGE PROCESSING, 2004, 138 : 191 - 245
  • [39] First Automatic Fongbe Continuous Speech Recognition System: Development of Acoustic Models and Language Models
    Laleye, Frejus A. A.
    Besacier, Laurent
    Ezin, Eugene C.
    Motamed, Cina
    PROCEEDINGS OF THE 2016 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2016, 8 : 477 - 482
  • [40] PRELIMINARIES TO AUTOMATIC RECOGNITION OF SPEECH - LANGUAGE IDENTIFICATION
    HOUSE, AS
    NEUBERG, EP
    WOHLFORD, RE
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1975, 57 : S34 - S34