Speech Recognition Using Hidden Markov Models with Polynomial Regression Functions as Nonstationary States

Cited by: 84
Authors
Deng, Li [1]
Aksmanovic, Mike [1]
Sun, Xiaodong [2]
Wu, C. F. Jeff [2]
Affiliations
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / No. 4
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.1109/89.326610
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
We propose, implement, and evaluate a class of nonstationary-state hidden Markov models (HMMs) in which each state is associated with a distinct polynomial regression function of time plus white Gaussian noise. The model represents the transitional acoustic trajectories of speech in a parametric manner, and includes the standard stationary-state HMM as a degenerate special case. We develop an efficient dynamic programming technique which includes the state sojourn time as an optimization variable, in conjunction with a state-dependent orthogonal polynomial regression method, for estimating the model parameters. Experiments on fitting models to speech data and on limited-vocabulary speech recognition demonstrate consistent superiority of these nonstationary-state HMMs over the traditional stationary-state HMMs.
Pages: 507 / 520
Number of pages: 14
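The abstract describes states whose means follow a polynomial in the time elapsed since state entry, fitted by orthogonal polynomial regression, with a dynamic program that also optimizes each state's sojourn time. The following is only a minimal sketch of that idea, not the authors' algorithm. It assumes scalar observations, a strict left-to-right topology in which each state is visited exactly once, a shared noise variance, and no transition probabilities, so that maximizing the Gaussian likelihood reduces to minimizing the residual sum of squares. The function names `orthogonal_basis`, `fit_segment`, and `segment` are hypothetical.

```python
def orthogonal_basis(duration, order):
    """Gram-Schmidt on 1, t, t^2, ... over t = 0..duration-1.

    The polynomial order is capped at duration - 1 so that short
    sojourns do not produce linearly dependent basis vectors.
    """
    basis = []
    for p in range(min(order, duration - 1) + 1):
        v = [float(t) ** p for t in range(duration)]
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b)) / sum(y * y for y in b)
            v = [x - proj * y for x, y in zip(v, b)]
        basis.append(v)
    return basis


def fit_segment(y, order):
    """Fit a polynomial trend to one state sojourn.

    Returns (fitted values, residual sum of squares). Because the basis
    is orthogonal, each coefficient is an independent projection.
    """
    fit = [0.0] * len(y)
    for b in orthogonal_basis(len(y), order):
        coef = sum(x * v for x, v in zip(y, b)) / sum(v * v for v in b)
        fit = [f + coef * v for f, v in zip(fit, b)]
    rss = sum((x - f) ** 2 for x, f in zip(y, fit))
    return fit, rss


def segment(y, n_states, order):
    """Dynamic program over state boundaries (sojourn times).

    cost[s][t] is the smallest total RSS for explaining y[:t] with the
    first s states, each occupying at least one frame.
    """
    T = len(y)
    inf = float("inf")
    cost = [[inf] * (T + 1) for _ in range(n_states + 1)]
    back = [[0] * (T + 1) for _ in range(n_states + 1)]
    cost[0][0] = 0.0
    for s in range(1, n_states + 1):
        for t in range(s, T + 1):
            for start in range(s - 1, t):  # state s covers y[start:t]
                if cost[s - 1][start] == inf:
                    continue
                c = cost[s - 1][start] + fit_segment(y[start:t], order)[1]
                if c < cost[s][t]:
                    cost[s][t], back[s][t] = c, start
    # Trace the optimal boundaries back from the final frame.
    bounds, t = [], T
    for s in range(n_states, 0, -1):
        bounds.append((back[s][t], t))
        t = back[s][t]
    bounds.reverse()
    return bounds, cost[n_states][T]


# Toy trajectory: a rising ramp followed by a falling ramp.
y = [0, 1, 2, 3, 4, 10, 8, 6, 4, 2]
print("order 1:", segment(y, 2, 1))  # nonstationary states (linear trend)
print("order 0:", segment(y, 2, 0))  # stationary states (constant mean)
```

Setting `order=0` gives each state a constant mean, which corresponds to the stationary-state special case mentioned in the abstract. On the toy sequence, the first-order model can follow both ramps, while the constant-mean model leaves a much larger residual.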