Unsupervised Language Model Adaptation by Data Selection for Speech Recognition

Cited by: 2
Authors
Khassanov, Yerbolat [1 ]
Chong, Tze Yuang [1 ]
Bigot, Benjamin [1 ]
Chng, Eng Siong [1 ]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Language model adaptation; Unsupervised adaptation; Data selection; Speech recognition;
DOI
10.1007/978-3-319-54472-4_48
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a language model (LM) adaptation framework based on data selection to improve the recognition accuracy of automatic speech recognition systems. Previous approaches to LM adaptation usually require additional data to adapt the existing background LM. In this work, we propose a novel two-pass decoding approach that uses no additional data and instead selects relevant data from the existing background corpus that was used to train the background LM. The motivation is that the background corpus consists of data from different domains, and as such, the LM trained on it is generic and not discriminative. To make the LM more discriminative, we select sentences from the background corpus that are similar in some linguistic characteristics to the utterances recognized in the first pass and use them to train a new LM, which is employed during second-pass decoding. We examine n-gram and bag-of-words features as the linguistic characteristics used for selection. Evaluated on the 11 talks in the test set of the TED-LIUM corpus, the proposed adaptation framework produced an LM that reduced the word error rate by up to 10% relative and the perplexity by up to 47% relative. When the LM was adapted for each talk individually, a further word error rate reduction was achieved.
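The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names bag-of-words features but does not give the similarity measure, so cosine similarity, whitespace tokenization, the `top_k` cutoff, and all function names here are assumptions made for illustration. Training the new LM on the selected sentences and the second decoding pass are left out.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased whitespace tokens (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_sentences(background, first_pass_hyps, top_k):
    """Rank background sentences by similarity to the first-pass output.

    background      -- sentences of the background corpus
    first_pass_hyps -- hypotheses produced by first-pass decoding
    top_k           -- hypothetical cutoff on how many sentences to keep
    """
    query = bag_of_words(" ".join(first_pass_hyps))
    scored = [(cosine(bag_of_words(s), query), s) for s in background]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Sentences sharing no words with the hypotheses are never selected.
    return [s for score, s in scored[:top_k] if score > 0.0]


# Toy data, invented for this example only.
background = [
    "the engine turbine blades were inspected",
    "stock markets fell sharply on monday",
    "the turbine engine needs maintenance",
    "the recipe calls for two eggs",
]
hyps = ["engine turbine maintenance schedule"]
selected = select_sentences(background, hyps, top_k=2)
print(selected)
```

In a two-pass setup along these lines, `selected` would serve as the training text for the adapted LM used in the second pass. Pooling hypotheses per talk rather than over the whole test set would correspond to the per-talk adaptation mentioned at the end of the abstract.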
Pages: 508 - 517
Page count: 10
Related papers
50 in total
  • [31] UNSUPERVISED SUBMODULAR SUBSET SELECTION FOR SPEECH DATA
    Wei, Kai
    Liu, Yuzong
    Kirchhoff, Katrin
    Bilmes, Jeff
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [32] Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation
    Li, Song
    Luo, Haoneng
    Hu, Wenxuan
    Liu, Yuan
    Zhang, Shiliang
    Li, Lin
    Hong, Qingyang
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 16 - 20
  • [33] Unsupervised Adaptation with Adversarial Dropout Regularization for Robust Speech Recognition
    Guo, Pengcheng
    Sun, Sining
    Xie, Lei
    INTERSPEECH 2019, 2019, : 749 - 753
  • [34] Unsupervised topic adaptation for morph-based speech recognition
    Mansikkaniemi, Andre
    Kurimo, Mikko
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2692 - 2696
  • [35] Unsupervised speaker adaptation for robust speech recognition in real environments
    Yamade, S
    Baba, A
    Yoshikawa, S
    Lee, A
    Saruwatari, H
    Shikano, K
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2005, 88 (08): : 30 - 41
  • [36] UNSUPERVISED ADAPTATION WITH DOMAIN SEPARATION NETWORKS FOR ROBUST SPEECH RECOGNITION
    Meng, Zhong
    Chen, Zhuo
    Mazalov, Vadim
    Li, Jinyu
    Gong, Yifan
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 214 - 221
  • [37] Unsupervised domain adaptation for speech emotion recognition using PCANet
    Huang, Zhengwei
    Xue, Wentao
    Mao, Qirong
    Zhan, Yongzhao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (05) : 6785 - 6799
  • [38] Unsupervised domain adaptation for speech emotion recognition using PCANet
    Zhengwei Huang
    Wentao Xue
    Qirong Mao
    Yongzhao Zhan
    Multimedia Tools and Applications, 2017, 76 : 6785 - 6799
  • [39] An unsupervised deep domain adaptation approach for robust speech recognition
    Sun, Sining
    Zhang, Binbin
    Xie, Lei
    Zhang, Yanning
    NEUROCOMPUTING, 2017, 257 : 79 - 87
  • [40] Semi-supervised and unsupervised discriminative language model training for automatic speech recognition
    Dikici, Erinc
    Saraclar, Murat
    SPEECH COMMUNICATION, 2016, 83 : 54 - 63