Experiments in speaker normalisation and adaptation for large vocabulary speech recognition

Cited by: 0
Authors
Pye, D
Woodland, PC
Institution
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper examines techniques for speaker normalisation and adaptation that are applied in training with the aim of removing some of the variability from the speaker independent models. Two techniques are examined: vocal tract normalisation (VTN), which estimates a single "vocal tract length" parameter for each speaker and then modifies the speech parameterisation accordingly, and speaker adaptive training (SAT), which estimates Gaussian mean and variance parameters jointly with a speaker-specific set of maximum likelihood linear regression (MLLR) based transformations. It is shown that VTN is effective for both clean speech and mismatched conditions, and that the further improvements obtained by applying MLLR in testing are essentially additive. Detailed results from the use of SAT show that worthwhile improvements over using MLLR with standard speaker independent models are obtained.
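The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the warping function, the search range, or the transform structure, so the piecewise-linear warp, the 0.88 to 1.12 grid, the `knee` parameter, and all function names below are assumptions chosen for illustration. The sketch shows (a) a per-speaker frequency warp controlled by one parameter, (b) picking that parameter by maximising a caller-supplied likelihood score, and (c) an MLLR-style affine transform of a Gaussian mean.

```python
def warp_frequency(f, alpha, f_max=8000.0, knee=0.85):
    """Piecewise-linear frequency warp (an assumed form, not taken from the paper).

    Below knee * f_max the frequency is scaled by alpha; above it, a straight
    line is used so that f_max still maps to f_max.
    """
    f0 = knee * f_max
    if f <= f0:
        return alpha * f
    # Line joining (f0, alpha * f0) to (f_max, f_max).
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f - f0)


def choose_warp_factor(score, alphas):
    """Return the warp factor with the highest score.

    `score` is a hypothetical callable: given alpha, it should return the
    log-likelihood of one speaker's data after warping with that alpha.
    """
    return max(alphas, key=score)


def apply_mllr_mean(A, b, mu):
    """MLLR-style mean adaptation: mu_adapted = A @ mu + b (plain lists)."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


# Assumed search grid: 13 warp factors from 0.88 to 1.12 in steps of 0.02.
ALPHAS = [0.88 + 0.02 * i for i in range(13)]

if __name__ == "__main__":
    # Toy score that peaks near 1.04, standing in for a real likelihood.
    best = choose_warp_factor(lambda a: -(a - 1.04) ** 2, ALPHAS)
    print("chosen warp factor:", round(best, 2))
    print("adapted mean:", apply_mllr_mean([[1.0, 0.0], [0.0, 2.0]],
                                           [0.5, -1.0], [2.0, 3.0]))
```

In a real system the score would come from aligning the speaker's warped features against the acoustic models, and the transform (A, b) would be estimated by maximum likelihood rather than supplied by hand.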
Pages: 1047 - 1050
Page count: 4
Related papers
50 in total
  • [1] Speaker adaptation in the philips system for large vocabulary continuous speech recognition
    Thelen, E
    Aubert, X
    Beyerlein, P
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1035 - 1038
  • [2] Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems
    Padmanabhan, M
    Bahl, LR
    Nahamoo, D
    Picheny, MA
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 701 - 704
  • [3] Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech Recognition
    Roupakia, Zoi
    Ragni, Anton
    Gales, Mark
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1782 - 1785
  • [4] MAP Based Speaker Adaptation in Very Large Vocabulary Speech Recognition of Czech
    Cerva, Petr
    Nouza, Jan
    RADIOENGINEERING, 2004, 13 (03) : 42 - 46
  • [5] Supervised and unsupervised speaker adaptation in large vocabulary continuous speech recognition of Czech
    Cerva, P
    Nouza, J
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2005, 3658 : 203 - 210
  • [6] Large Vocabulary Speech Recognition: Speaker Dependent and Speaker Independent
    Hemakumar, G.
    Punitha, P.
    INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 73 - 80
  • [7] SPEAKER ADAPTATION BY VARIABLE REFERENCE MODEL SUBSPACE AND APPLICATION TO LARGE VOCABULARY SPEECH RECOGNITION
    Teng, Wen Xuan
    Gravier, Guillaume
    Bimbot, Frederic
    Soufflet, Frederic
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4381 - 4384
  • [8] Probabilistic Latent Speaker Analysis for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1889 - 1892
  • [9] Probabilistic Latent Speaker Training for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1225 - 1228
  • [10] Speaker verification through large vocabulary continuous speech recognition
    Newman, M
    Gillick, L
    Ito, Y
    McAllaster, D
    Peskin, B
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2419 - 2422