Unsupervised Language Model Adaptation by Data Selection for Speech Recognition

Cited by: 2
Authors
Khassanov, Yerbolat [1 ]
Chong, Tze Yuang [1 ]
Bigot, Benjamin [1 ]
Chng, Eng Siong [1 ]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Language model adaptation; Unsupervised adaptation; Data selection; Speech recognition;
DOI
10.1007/978-3-319-54472-4_48
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a language model (LM) adaptation framework based on data selection to improve the recognition accuracy of automatic speech recognition systems. Previous approaches to LM adaptation usually require additional data to adapt the existing background LM. In this work, we propose a novel two-pass decoding approach that uses no additional data and instead selects relevant data from the existing background corpus used to train the background LM. The motivation is that the background corpus consists of data from different domains, so the LM trained on it is generic and not discriminative. To make the LM more discriminative, we select sentences from the background corpus that are similar in some linguistic characteristics to the utterances recognized in the first pass and use them to train a new LM, which is employed during second-pass decoding. We examine n-gram and bag-of-words features as the linguistic characteristics used in the selection criteria. Evaluated on the 11 talks in the test set of the TED-LIUM corpus, the proposed adaptation framework produced an LM that reduced the word error rate by up to 10% relative and the perplexity by up to 47% relative. When the LM was adapted for each talk individually, a further word error rate reduction was achieved.
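The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names bag-of-words features but does not state the similarity measure or the selection cut-off, so the cosine similarity, the `top_k` parameter, the function names, and the toy sentences below are all assumptions made for illustration. Training the new LM on the selected sentences and running the second decoding pass are not shown.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts for a whitespace-tokenized string."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(first_pass_hyps, background, top_k):
    """Return the top_k background sentences most similar to the first-pass output."""
    # Pool all first-pass hypotheses into one query vector.
    query = bow(" ".join(first_pass_hyps))
    ranked = sorted(background, key=lambda s: cosine(query, bow(s)), reverse=True)
    return ranked[:top_k]


# Toy data, invented for illustration only.
hyps = ["the neural network learns speech features"]
corpus = [
    "stock markets fell sharply on monday",
    "a neural network can learn features from speech",
    "the recipe needs two cups of flour",
]
selected = select_sentences(hyps, corpus, top_k=1)
```

In this sketch, pooling all hypotheses into one query corresponds to adapting a single LM for the whole test set; building one query per talk would correspond to the per-talk adaptation mentioned at the end of the abstract.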
Pages: 508-517
Number of pages: 10
Related papers
50 in total
  • [21] Speech selection and environmental adaptation for asynchronous speech recognition
    Ren, Bo
    Wang, Longbiao
    Kai, Atsuhiko
    Zhang, Zhaofeng
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 119 - 124
  • [22] On the Adaptation of Foreign Language Speech Recognition Engines for Lithuanian Speech Recognition
    Rudzionis, Vytautas
    Maskeliunas, Rytis
    Rudzionis, Algimantas
    Ratkevicius, Kastytis
    BUSINESS INFORMATION SYSTEMS WORKSHOPS, 2009, 37 : 113 - +
  • [23] Attention-based Contextual Language Model Adaptation for Speech Recognition
    Martinez, Richard Diehl
    Novotney, Scott
    Bulyko, Ivan
    Rastrow, Ariya
    Stolcke, Andreas
    Gandhe, Ankur
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1994 - 2003
  • [24] Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition
    Li, Ke
    Xu, Hainan
    Wang, Yiming
    Povey, Daniel
    Khudanpur, Sanjeev
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3373 - 3377
  • [25] Language model and speaking rate adaptation for spontaneous presentation speech recognition
    Nanjo, H
    Kawahara, T
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (04): : 391 - 400
  • [26] Efficient Language Model Adaptation for Automatic Speech Recognition of Spoken Translations
    Pelemans, Joris
    Vanallemeersch, Tom
    Demuynck, Kris
    Van Hamme, Hugo
    Wambacq, Patrick
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2262 - 2266
  • [27] Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
    Meng, Zhong
    Gaur, Yashesh
    Kanda, Naoyuki
    Li, Jinyu
    Chen, Xie
    Wu, Yu
    Gong, Yifan
    INTERSPEECH 2022, 2022, : 2608 - 2612
  • [28] Textual Data Selection for Language Modelling in the Scope of Automatic Speech Recognition
    Mezzoudj, Freha
    Langlois, David
    Jouvet, Denis
    Benyettou, Abdelkader
    1ST INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING, 2018, 128 : 55 - 64
  • [29] Data selection for speech recognition
    Wu, Yi
    Zhang, Rong
    Rudnicky, Alexander
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 562 - 565
  • [30] Unsupervised crosslingual adaptation of tokenisers for spoken language recognition
    Ng, Raymond W. M.
    Nicolao, Mauro
    Hain, Thomas
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 327 - 342