Speech Recognition Using Hidden Markov Models with Polynomial Regression Functions as Nonstationary States

Cited by: 84

Authors:

Deng, Li [1]

Aksmanovic, Mike [1]

Sun, Xiaodong [2]

Wu, C. F. Jeff [2]

Affiliations:

[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / No. 04

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

11;

DOI:

10.1109/89.326610

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We propose, implement, and evaluate a class of nonstationary-state hidden Markov models (HMMs) in which each state is associated with a distinct polynomial regression function of time plus white Gaussian noise. The model represents the transitional acoustic trajectories of speech in a parametric manner, and includes the standard stationary-state HMM as a special, degenerate case. We develop an efficient dynamic programming technique which includes the state sojourn time as an optimization variable, in conjunction with a state-dependent orthogonal polynomial regression method, for estimating the model parameters. Experiments on fitting models to speech data and on limited-vocabulary speech recognition demonstrate consistent superiority of these nonstationary-state HMMs over the traditional stationary-state HMMs.
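The idea in the abstract (a polynomial trend per state, with state durations chosen by dynamic programming) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes scalar observations, a left-to-right state order, a noise variance shared by all states (so maximizing likelihood reduces to minimizing the residual sum of squares), and no transition probabilities. The function names `fit_segment` and `segment` are hypothetical, and a Legendre basis stands in for the orthogonal polynomials.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_segment(obs, order):
    """Least-squares polynomial trend for one state's segment.

    Returns (coefficients, residual sum of squares). Time within the
    segment is rescaled to [-1, 1], where Legendre polynomials are
    orthogonal, which keeps the fit well conditioned.
    """
    n = len(obs)
    t = np.linspace(-1.0, 1.0, n) if n > 1 else np.zeros(1)
    deg = min(order, n - 1)             # a short segment cannot support a high order
    basis = legendre.legvander(t, deg)  # shape (n, deg + 1)
    coeffs, *_ = np.linalg.lstsq(basis, obs, rcond=None)
    resid = obs - basis @ coeffs
    return coeffs, float(resid @ resid)


def segment(obs, n_states, order):
    """Split obs into n_states consecutive segments by dynamic programming.

    The sojourn time of each state is an optimization variable: every
    admissible boundary is scored by the fit error of its polynomial trend.
    Returns (boundaries, total residual sum of squares).
    """
    T = len(obs)
    # cost[s, e] is the fit error of one polynomial over obs[s:e].
    cost = np.full((T + 1, T + 1), np.inf)
    for s in range(T):
        for e in range(s + 1, T + 1):
            cost[s, e] = fit_segment(obs[s:e], order)[1]

    # best[k, e] is the lowest error when the first k states cover obs[:e].
    best = np.full((n_states + 1, T + 1), np.inf)
    back = np.zeros((n_states + 1, T + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_states + 1):
        for e in range(k, T + 1):
            for s in range(k - 1, e):
                c = best[k - 1, s] + cost[s, e]
                if c < best[k, e]:
                    best[k, e] = c
                    back[k, e] = s

    # Backtrack from the end of the sequence to recover the boundaries.
    bounds = [T]
    e = T
    for k in range(n_states, 0, -1):
        e = int(back[k, e])
        bounds.append(e)
    return bounds[::-1], float(best[n_states, T])


# Synthetic trajectory: a rising line followed by a falling line.
obs = np.concatenate([np.arange(10.0), 20.0 - 2.0 * np.arange(10.0)])
bounds_linear, err_linear = segment(obs, n_states=2, order=1)
bounds_const, err_const = segment(obs, n_states=2, order=0)
```

Setting `order=0` reduces each state to a constant mean, which corresponds to the stationary-state special case mentioned in the abstract; on this synthetic trajectory the order-1 fit leaves a far smaller residual than the order-0 fit.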


Pages: 507-520

Page count: 14
