Shared-Distribution Hidden Markov Models for Speech Recognition

Cited by: 62

Authors:

Hwang, Mei-Yuh [1]

Huang, Xuedong [1]

Affiliations:

[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1993, Vol. 1, No. 4

Keywords:

Distribution sharing; Generalized triphone model; Hidden Markov models; Parameter sharing; Phonetic model

DOI:

10.1109/89.242487

Chinese Library Classification:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Parameter sharing plays an important role in statistical modeling since the amount of training data is usually limited. On one hand, we would like to use models that are as detailed as possible. On the other hand, with models that are too detailed, we can no longer reliably estimate the parameters. As a parameter-sharing technique, generalized triphones may force two models to be merged when only parts of the models are similar. This problem can be avoided if sharing is carried out at a sub-model level. In this paper, a shared-distribution hidden Markov model is presented for speaker-independent continuous speech recognition. Here, the output distributions across different phonetic HMM's are shared with each other when they exhibit acoustic similarity. This sharing also gives us the freedom to use a larger number of Markov states for each phonetic model. Although an increase in the number of states will increase the total number of free parameters, with distribution sharing we can collapse redundant states while maintaining necessary ones. The shared-distribution model reduced the word error rate on the DARPA Resource Management task by 20% in comparison with the generalized-triphone model.
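The idea of sharing output distributions across states that "exhibit acoustic similarity" can be illustrated with a small sketch. The abstract does not say how similarity is measured, so the following assumes discrete (codebook-count) output distributions and a greedy bottom-up merge that, at each step, pools the pair whose merging raises the count-weighted entropy the least. The state names, the function names, and the stopping rule `num_shared` are hypothetical and chosen only for illustration; this is not presented as the authors' exact procedure.

```python
import math
from itertools import combinations


def entropy(counts):
    """Shannon entropy (in nats) of the distribution implied by raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)


def merge_cost(a, b):
    """Increase in count-weighted entropy if count vectors a and b are pooled.

    Assumed similarity measure: a small cost means the two output
    distributions are similar and can share one set of parameters.
    """
    merged = [x + y for x, y in zip(a, b)]
    return (sum(merged) * entropy(merged)
            - sum(a) * entropy(a)
            - sum(b) * entropy(b))


def share_distributions(state_counts, num_shared):
    """Greedily merge the cheapest pair until num_shared distributions remain.

    state_counts maps a (hypothetical) state name to its codeword counts.
    Returns a list of (member_state_names, pooled_counts) tuples.
    """
    clusters = [([name], list(counts)) for name, counts in state_counts.items()]
    while len(clusters) > num_shared:
        # Find the pair of clusters that is cheapest to merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_cost(clusters[p[0]][1], clusters[p[1]][1]),
        )
        names = clusters[i][0] + clusters[j][0]
        pooled = [x + y for x, y in zip(clusters[i][1], clusters[j][1])]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((names, pooled))
    return clusters


if __name__ == "__main__":
    # Hypothetical states from three triphone models, 3-codeword codebook.
    counts = {
        "k(ae,t).s1": [90, 10, 0],
        "k(ae,p).s1": [85, 15, 0],
        "t(iy,n).s2": [0, 5, 95],
    }
    for members, pooled in share_distributions(counts, num_shared=2):
        print(members, pooled)
```

Because merging happens per state rather than per model, two triphone models in this sketch can share their first-state distribution while keeping distinct distributions elsewhere, which is the sub-model sharing the abstract contrasts with generalized triphones.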

Pages: 414-420

Page count: 7
