Using boosting to improve a hybrid HMM/neural network speech recognizer

Cited by: 26
Authors
Schwenk, H [1]
Affiliation
[1] CNRS, LIMSI, F-91403 Orsay, France
DOI
10.1109/ICASSP.1999.759874
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Boosting is a general method for improving the performance of almost any learning algorithm. A recently proposed and very promising boosting algorithm is AdaBoost [7]. In this paper we investigate whether AdaBoost can be used to improve a hybrid HMM/neural network continuous speech recognizer. Boosting significantly reduces the word error rate from 6.3% to 5.3% on a test set of the OGI Numbers95 corpus, a medium-size continuous numbers recognition task. These results compare favorably with other combining techniques using several different feature representations or additional information from longer time spans.
Ensemble methods or committees of learning machines can often improve the performance of a system in comparison to a single learning machine. A recently proposed and very promising boosting algorithm is AdaBoost [7]. It constructs a composite classifier by sequentially training classifiers while placing more and more emphasis on certain patterns. Several authors have reported important improvements with respect to one classifier on several machine learning benchmark problems of the UCI repository, e.g. [2, 6]. These experiments displayed rather intriguing generalization properties, such as a continued decrease in generalization error after the training error reaches zero. However, most of these databases are very small (only several hundred training examples) and contain no significant amount of noise. There is also recent evidence that AdaBoost may very well overfit if we combine several hundred thousand classifiers [8], and [5] reports severe performance degradation of AdaBoost when 20% noise is added to the class labels. In summary, we can say that the reasons for the impressive success of AdaBoost are still not completely understood. To the best of our knowledge, an application of AdaBoost to a real-world problem has not yet been reported in the literature either.
In this paper we investigate whether AdaBoost can be applied to boost the performance of a continuous speech recognition system. In this domain we have to deal with large amounts of data (often more than 1 million training examples) and inherently noisy phoneme labels. The paper is organized as follows. In the next two sections we summarize the AdaBoost algorithm and our baseline speech recognizer. In the third section we show how AdaBoost can be applied to this task, report results on the Numbers95 corpus, and compare them with other classifier combination techniques. The paper finishes with a conclusion and perspectives for future work.
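The sequential reweighting idea mentioned above (training classifiers one after another while placing more emphasis on certain patterns) can be illustrated with a minimal sketch. This is an assumption-laden toy, not the method of the paper: it uses the standard discrete two-class AdaBoost update with one-dimensional threshold classifiers ("stumps") instead of neural networks, and the names `train_stump`, `adaboost`, and `predict` are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: returns `polarity` when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Discrete AdaBoost for labels in {-1, +1}; returns a weighted ensemble."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with a uniform distribution over examples
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)  # vote of this classifier
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
```

On this toy data a single stump necessarily misclassifies some points, while the weighted vote of three stumps fits all six. The speech recognition setting described above differs in scale, in the number of classes, and in the noisiness of the labels, so the sketch only conveys the reweighting mechanism.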
Pages: 1009 - 1012
Page count: 4
Related papers
50 in total
  • [31] CSELT hybrid HMM/neural networks technology for continuos speech recognition
    Gemello, R
    Albesano, D
    Mana, F
    IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL V, 2000, : 103 - 108
  • [32] Utterance verification of short keywords using hybrid neural-network/HMM approach
    Ou, JZ
    Chen, KJ
    Wang, XP
    Li, ZG
    2001 INTERNATIONAL CONFERENCES ON INFO-TECH AND INFO-NET PROCEEDINGS, CONFERENCE A-G: INFO-TECH & INFO-NET: A KEY TO BETTER LIFE, 2001, : B671 - B676
  • [33] Building a neural speech recognizer for quranic recitations
    Al-Issa S.
    Al-Ayyoub M.
    Al-Khaleel O.
    Elmitwally N.
    International Journal of Speech Technology, 2023, 26 (04) : 1131 - 1151
  • [34] Hybrid Deep Neural Network - Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition
    Li, Longfei
    Zhao, Yong
    Jiang, Dongmei
    Zhang, Yanning
    Wang, Fengna
    Gonzalez, Isabel
    Valentin, Enescu
    Sahli, Hichem
    2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2013, : 312 - 317
  • [35] Experiments on a parametric nonlinear spectral warping for an HMM-based speech recognizer
    Mashao, DJ
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 17 - 20
  • [36] Utterance-level boosting of HMM speech recognizers
    Meyer, G
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 109 - 112
  • [37] Recognition of Chinese speech using hybrid HMM/HNN models
    Jia, Ying
    Du, Limin
    Hou, Ziqiang
    International Conference on Signal Processing Proceedings, ICSP, 1998, 1 : 726 - 729
  • [38] Recognition of Chinese speech using hybrid HMM HNN models
    Jia, Y
    Du, LM
    Hou, ZQ
    ICSP '98: 1998 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1998, : 726 - 729
  • [39] Speech/speaker recognition using a HMM/GMM hybrid model
    Rodriguez, E
    Ruiz, B
    Garcia-Crespo, A
    Garcia, F
    AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, 1997, 1206 : 227 - 234
  • [40] A Discriminative Training Method Applied to a Hybrid ANN/HMM Phoneme Recognizer
    Lopes, Carla
    Perdigao, Fernando
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 1982 - 1985