Speaker selection training for large vocabulary continuous speech recognition

Cited by: 0
Authors
Huang, C [1 ]
Chen, T [1 ]
Chang, E [1 ]
Affiliation
[1] Microsoft Research Asia, Beijing 100080, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Acoustic variability across speakers is one of the challenges facing speaker-independent (SI) speech recognition systems. Although the dominant speaker adaptation technologies, such as MLLR and MAP, are a powerful solution, they may become inefficient when there is not enough enrollment data. In this paper, we propose an adaptation method based on speaker selection training, which makes full use of the statistics of the training corpus. A relative error rate reduction of 5.31% is achieved when only one utterance is available. We compare different speaker selection strategies, namely PCA-, HMM- and GMM-based methods. In addition, we investigate the impact of the number of selected cohort speakers and of the number of utterances from the target speaker. Furthermore, a comparison and integration with MLLR adaptation are shown. Finally, some ongoing work is discussed: dynamically varying the number of selected speakers, measuring the relative contribution of each selected speaker, and using model synthesis to speed up the computationally expensive re-estimation procedure.
Pages: 609-612
Number of pages: 4
Related papers
50 in total
  • [21] Eigenspace-based MLLR with speaker adaptive training in large vocabulary conversational speech recognition
    Doumpiotis, V
    Deng, YG
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 357 - 360
  • [22] Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems
    Padmanabhan, M
    Bahl, LR
    Nahamoo, D
    Picheny, MA
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 701 - 704
  • [23] Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition
    Pylkkonen, Janne
    Kurimo, Mikko
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1210 - 1213
  • [24] Training of across-word phoneme models for large vocabulary continuous speech recognition
    Sixtus, A
    Ney, H
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 849 - 852
  • [25] Developments in large vocabulary, continuous speech recognition of German
    AddaDecker, M
    Adda, G
    Lamel, L
    Gauvain, JL
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 153 - 156
  • [26] Utilizing Lipreading in Large Vocabulary Continuous Speech Recognition
    Palecek, Karel
    SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 767 - 776
  • [27] The RWTH large vocabulary continuous speech recognition system
    Ney, H
    Welling, L
    Ortmanns, S
    Beulen, K
    Wessel, F
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 853 - 856
  • [28] Combating Reverberation in Large Vocabulary Continuous Speech Recognition
    Mitra, Vikramjit
    Van Hout, Julien
    McLaren, Mitchell
    Wang, Wen
    Graciarena, Martin
    Vergyri, Dimitra
    Franco, Horacio
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2449 - 2453
  • [29] Accent Issues in Large Vocabulary Continuous Speech Recognition
    Chao Huang
    Tao Chen
    Eric Chang
    International Journal of Speech Technology, 2004, 7 (2-3) : 141 - 153
  • [30] Experimenting with lipreading for large vocabulary continuous speech recognition
    Palecek, Karel
    JOURNAL ON MULTIMODAL USER INTERFACES, 2018, 12 (04) : 309 - 318