Model-based margin estimation for hidden Markov model learning and generalisation

Cited by: 1
Authors
Siniscalchi, Sabato Marco [1]
Li, Jinyu [2]
Lee, Chin-Hui [3]
Affiliations
[1] Kore Univ Enna, Fac Engn & Architecture, Enna, Sicily, Italy
[2] Microsoft Corp, Redmond, WA 98052 USA
[3] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
estimation theory; Gaussian processes; hidden Markov models; learning (artificial intelligence); parameter estimation; pattern classification; speech recognition; support vector machines; model-based margin estimation; hidden Markov model learning; generalisation; speech scientist; margin-based classifier; HMM; continuous-density hidden Markov estimation model; automatic speech recognition; ASR; maximum classification margin; learning support vector machine; soft margin estimation framework; standard distance-based margin; state Gaussian mixture model density; SPEECH RECOGNITION; PATTERN-RECOGNITION;
DOI
10.1049/iet-spr.2013.0036
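Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes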
0808 ; 0809 ;
Abstract
Recently, speech scientists have been motivated by the great success of building margin-based classifiers, and have thus proposed novel methods to estimate continuous-density hidden Markov models (HMMs) for automatic speech recognition (ASR) according to the notion that the decision boundaries determined by the estimated HMMs attain the maximum classification margin, as in learning support vector machines. Although good performance has been observed, the margin used in the ASR community is often specified as a parameter that has no explicit relationship with the HMM parameters. The issues of how the margin is related to the HMM parameters and how it directly characterises the generalisation capability of HMM-based classifiers have not been addressed so far in the community. In this study, the authors attempt to formulate the margin used in the soft margin estimation framework as a function of the HMM parameters. The key idea is to relate the standard distance-based margin to the concept of divergence among competing HMM state Gaussian mixture model densities. Experimental results show that the proposed model-based margin function is a good indicator of the quality of HMMs on a given ASR task, without the conventional need to run extensive experiments on a separate set of test samples.
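To make the notion of divergence between competing state densities concrete, the sketch below computes the symmetric Kullback-Leibler divergence between two diagonal-covariance Gaussians. It is an illustration only, not the authors' margin function: the abstract refers to Gaussian mixture densities, for which no closed-form divergence exists, so a single Gaussian per state is assumed here as a simplification. The function names `kl_diag_gauss` and `symmetric_divergence` are hypothetical.

```python
import math
from typing import Sequence


def kl_diag_gauss(mu_p: Sequence[float], var_p: Sequence[float],
                  mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term: log variance ratio + scaled spread and mean gap - 1
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_divergence(mu_p: Sequence[float], var_p: Sequence[float],
                         mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """Symmetric divergence KL(p || q) + KL(q || p)."""
    return (kl_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kl_diag_gauss(mu_q, var_q, mu_p, var_p))


# Assumed toy state densities: unit variances, means differing by (1, 2).
d = symmetric_divergence([0.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 1.0])
print(d)
```

With equal variances the symmetric divergence reduces to the squared distance between the means scaled by the variance, which gives an intuition for how a distance-based margin and a divergence between densities can be related.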
Pages: 704-709
Number of pages: 6