Structural maximum a posteriori linear regression for fast HMM adaptation

Cited by: 59

Authors:

Siohan, O [1]

Myrvoll, TA [1]

Lee, CH [1]

Affiliations:

[1] Bell Labs, Lucent Technol, Multimedia Commun Res Lab, Murray Hill, NJ 07974 USA

Source:

COMPUTER SPEECH AND LANGUAGE | 2002 / Vol. 16 / No. 01

DOI:

10.1006/csla.2001.0181

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Transformation-based model adaptation techniques have been used for many years to improve the robustness of speech recognition systems. While the criterion used to estimate transformation parameters has mainly been maximum likelihood estimation (MLE), Bayesian versions of some of the most popular transformation-based adaptation methods have recently been introduced, such as MAPLR, a maximum a posteriori (MAP) version of the well-known maximum likelihood linear regression (MLLR) algorithm. This is in fact an attempt to constrain parameter estimation in order to obtain reliable estimates from a limited amount of data, not only to prevent overfitting the adaptation data but also to allow prior knowledge to be integrated into transformation-based adaptation techniques. Since such techniques require the estimation of a large number of transformation parameters when the amount of adaptation data is large, a large number of prior densities must also be defined for these parameters. Robust estimation of these prior densities is therefore a crucial issue that directly affects the efficiency and effectiveness of the Bayesian techniques. This paper proposes to estimate these priors using the notion of hierarchical priors, embedded into the tree structure used to control transformation complexity. The proposed algorithm, called structural MAPLR (SMAPLR), has been evaluated on the Spoke3 1993 test set of the WSJ task. It is shown that SMAPLR reduces the risk of overtraining and exploits the adaptation data much more efficiently than MLLR, leading to a significant reduction of the word error rate for any amount of adaptation data. © 2002 Academic Press.


Pages: 5-24

Page count: 20
