Aggregate a posteriori linear regression adaptation

Cited by: 11
Authors
Chien, JT [1]
Huang, CH [1]
Affiliation
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006, Vol. 14, No. 3
Keywords
aggregate a posteriori criterion; Bayesian learning; discriminative adaptation; linear regression adaptation; speaker adaptation; speech recognition
DOI
10.1109/TSA.2005.860847
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a new discriminative linear regression adaptation algorithm for hidden Markov model (HMM) based speech recognition. The cluster-dependent regression matrices are estimated from speaker-specific adaptation data by maximizing the aggregate a posteriori probability, which can be expressed as a classification error function that adopts the logarithm of the posterior distribution as the discriminant function. Accordingly, aggregate a posteriori linear regression (AAPLR) is developed for discriminative adaptation, in which the classification errors on the adaptation data are minimized. Because the prior distribution of the regression matrix is involved, AAPLR is equipped with Bayesian learning capability. We demonstrate that the difference between AAPLR discriminative adaptation and maximum a posteriori linear regression (MAPLR) adaptation lies in the treatment of the evidence. Unlike minimum classification error linear regression (MCELR), AAPLR has a closed-form solution, which enables rapid adaptation. Experimental results show that AAPLR speaker adaptation improves speech recognition performance at moderate computational cost compared to maximum likelihood linear regression (MLLR), MAPLR, MCELR, and conditional maximum likelihood linear regression (CMLLR). These results are verified for both supervised and unsupervised adaptation with different amounts of adaptation data.
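The role of the evidence mentioned in the abstract can be illustrated with Bayes' rule. This is a generic sketch, not the paper's own objective: the symbols $X$ (adaptation data), $W$ (regression matrix), $\lambda_c$ (the correct class model), $\lambda_j$ (competing class models) and $P(\lambda_j)$ (class priors) are assumptions introduced here for illustration.

```latex
% Class posterior under a transformed model set (generic Bayes' rule)
P(\lambda_c \mid X, W) = \frac{p(X \mid \lambda_c, W)\, P(\lambda_c)}{\sum_{j} p(X \mid \lambda_j, W)\, P(\lambda_j)}

% Generic MAP estimate of W from data labelled with the correct class
\hat{W}_{\mathrm{MAP}} = \arg\max_{W}\; p(X \mid \lambda_c, W)\, p(W)
```

In this notation the denominator is the evidence. It sums over competing models and therefore depends on $W$, so a criterion built on the posterior rewards separation from competitors, whereas a MAP-style criterion uses only the likelihood of the correct model and the prior $p(W)$. How AAPLR combines these terms exactly is given in the paper and is not reproduced here.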
Pages: 797-807
Page count: 11