Adaptation of hidden Markov models using maximum model distance algorithm

Cited by: 3
Authors
He, QH [1]
Kwong, S
Hong, QY
Affiliations
[1] S China Univ Technol, Guangzhou 510641, Peoples R China
[2] City Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS | 2004 / Vol. 34 / No. 2
Keywords
hidden Markov model; maximum model distance; speaker adaptation;
DOI
10.1109/TSMCA.2003.818884
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper presents a new approach that uses the maximum model distance (MMD) method for the adaptation of hidden Markov models (HMMs). The method shares the framework used to construct speech recognizers from abundant data, and it works effectively with any amount of adaptation data. All parameters of the HMMs, with or without adaptation data, can be adapted. If the adaptation data are sufficient, the adapted models gradually become speaker-dependent. Both dialect and speaker adaptation experiments were conducted to investigate the effectiveness of the proposed algorithm. In the speaker adaptation experiments, a phoneme error reduction of up to 65.55% was achieved, and MMD reduced the phoneme error by 16.91% even when only one adaptation utterance was available.
Pages: 270-276
Page count: 7