Unsupervised Language Model Adaptation by Data Selection for Speech Recognition

Cited by: 2
Authors
Khassanov, Yerbolat [1]
Chong, Tze Yuang [1]
Bigot, Benjamin [1]
Chng, Eng Siong [1]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Language model adaptation; Unsupervised adaptation; Data selection; Speech recognition;
DOI
10.1007/978-3-319-54472-4_48
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a language model (LM) adaptation framework based on data selection to improve the recognition accuracy of automatic speech recognition systems. Previous approaches to LM adaptation usually require additional data to adapt the existing background LM. In this work, we propose a novel two-pass decoding approach that uses no additional data and instead selects relevant data from the existing background corpus used to train the background LM. The motivation is that the background corpus consists of data from different domains, so the LM trained on it is generic and not discriminative. To make the LM more discriminative, we select sentences from the background corpus that are similar in some linguistic characteristics to the utterances recognized in the first pass and use them to train a new LM, which is employed during second-pass decoding. We examine n-gram and bag-of-words features as the linguistic characteristics used in the selection criteria. Evaluated on the 11 talks in the test set of the TED-LIUM corpus, the proposed adaptation framework produced an LM that reduced the word error rate by up to 10% relative and the perplexity by up to 47% relative. When the LM was adapted for each talk individually, further word error rate reduction was achieved.
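The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure or the selection threshold, so cosine similarity over word counts and a fixed `top_k` cutoff are assumptions made here, and the function names `bag_of_words`, `cosine`, and `select_sentences` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenize, and count words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_sentences(first_pass_hypotheses, background_corpus, top_k):
    """Return the top_k background sentences most similar to the
    first-pass hypotheses (assumed criterion: bag-of-words cosine)."""
    query = bag_of_words(" ".join(first_pass_hypotheses))
    ranked = sorted(
        background_corpus,
        key=lambda sentence: cosine(query, bag_of_words(sentence)),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data, invented for illustration only.
hypotheses = ["the neural network learns speech features"]
corpus = [
    "stock markets fell sharply today",
    "a neural network models speech",
    "the recipe needs two eggs",
]
selected = select_sentences(hypotheses, corpus, top_k=1)
```

In a full pipeline, the selected sentences would then be used to train the adapted LM for second-pass decoding; that training step is outside the scope of this sketch.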
Pages: 508-517
Number of pages: 10
Related papers
50 in total
  • [41] Optimized data selection strategy based unsupervised acoustic modeling for low data resource speech recognition
    Qian, Yanmin
    Liu, Jia
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2013, 53 (07): : 1001 - 1004
  • [42] UNSUPERVISED CV LANGUAGE MODEL ADAPTATION BASED ON DIRECT LIKELIHOOD MAXIMIZATION SENTENCE SELECTION
    Shinozaki, Takahiro
    Horiuchi, Yasuo
    Kuroiwa, Shingo
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 5029 - 5032
  • [43] NORMALIZATION AND ADAPTATION OF SPEECH DATA FOR AUTOMATIC SPEECH RECOGNITION
    SCARR, RWA
    INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1970, 2 (01): : 41 - 59
  • [44] Just-in-time latent semantic adaptation on language model for Chinese speech recognition using web data
    Gao, Qin
    Lin, Xiaojun
    Wu, Xihong
    2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, 2006, : 50 - +
  • [45] Dynamic Language Model Adaptation Using Presentation Slides for Lecture Speech Recognition
    Yamazaki, Hiroki
    Iwano, Koji
    Shinoda, Koichi
    Furui, Sadaoki
    Yokota, Haruo
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 89 - 92
  • [46] TOPIC N-GRAM COUNT LANGUAGE MODEL ADAPTATION FOR SPEECH RECOGNITION
    Haidar, Md. Akmal
    O'Shaughnessy, Douglas
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 165 - 169
  • [47] LSTM LANGUAGE MODEL ADAPTATION WITH IMAGES AND TITLES FOR MULTIMEDIA AUTOMATIC SPEECH RECOGNITION
    Moriya, Yasufumi
    Jones, Gareth. J. F.
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 219 - 226
  • [48] Style-specific language model adaptation for Korean conversational speech recognition
    Park, YH
    Chung, MW
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 591 - 596
  • [49] Unsupervised language model adaptation for broadcast news
    Chen, LZ
    Gauvain, JL
    Lamel, L
    Adda, G
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 220 - 223
  • [50] Tree-structured model selection and simulated-data adaptation for environmental and speaker robust speech recognition
    Thatphithakkul, Nattanun
    Kruatrachue, Boontee
    Wutiwiwatchai, Chai
    Marukatat, Sanparith
    Boonpiam, Vataya
    2007 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, VOLS 1-3, 2007, : 1570 - +