Discriminative optimisation of large vocabulary recognition systems

Authors
Valtchev, V
Woodland, PC
Young, SJ
Affiliations
Source
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper describes a framework for optimising the structure and parameters of a continuous density HMM-based large vocabulary recognition system using the Maximum Mutual Information Estimation (MMIE) criterion. To reduce the computational complexity of the MMIE training algorithm, confusable segments of speech are identified and stored as word lattices of alternative utterance hypotheses. An iterative mixture splitting procedure is also employed to adjust the number of mixture components in each state during training such that the optimal balance between number of parameters and available training data is achieved. Experiments are presented on various test sets from the Wall Street Journal database using the full SI-284 training set. These show that the use of lattices makes MMIE training practicable for very complex recognition systems and large training sets. Furthermore, experimental results demonstrate that MMIE optimisation of system structure and parameters can yield useful increases in recognition accuracy.
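For orientation, the MMIE criterion named in the abstract is conventionally written as below. This is the standard textbook form, not a formula taken from this record, and every symbol in it is an assumption introduced here for illustration.

```latex
% Standard MMIE objective (textbook form; notation assumed, not from this record).
% O_r   : r-th training utterance, r = 1..R
% w_r   : reference transcription of O_r
% M_w   : composite HMM for word sequence w
% P(w)  : language model probability of w
% lambda: acoustic model parameters being optimised
\mathcal{F}_{\mathrm{MMIE}}(\lambda)
  = \sum_{r=1}^{R} \log
    \frac{p_{\lambda}(O_r \mid M_{w_r})\, P(w_r)}
         {\sum_{w} p_{\lambda}(O_r \mid M_{w})\, P(w)}
```

The denominator sums over competing word sequences. In a lattice-based scheme such as the one the abstract describes, that sum would presumably be restricted to the hypotheses encoded in each utterance's word lattice, which is what keeps the computation tractable.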
Pages: 18-21
Page count: 4