Aggregate a posteriori linear regression adaptation

Cited by: 11

Authors:

Chien, JT [1]

Huang, CH [1]

Affiliation:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006, Vol. 14, No. 3

Keywords:

aggregate a posteriori criterion; Bayesian learning; discriminative adaptation; linear regression adaptation; speaker adaptation; speech recognition

DOI:

10.1109/TSA.2005.860847

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We present a new discriminative linear regression adaptation algorithm for hidden Markov model (HMM) based speech recognition. The cluster-dependent regression matrices are estimated from speaker-specific adaptation data by maximizing the aggregate a posteriori probability, which can be expressed in the form of a classification error function that adopts the logarithm of the posterior distribution as the discriminant function. Accordingly, aggregate a posteriori linear regression (AAPLR) is developed for discriminative adaptation, in which the classification errors of the adaptation data are minimized. Because the prior distribution of the regression matrix is involved, AAPLR is equipped with Bayesian learning capability. We demonstrate that the difference between AAPLR discriminative adaptation and maximum a posteriori linear regression (MAPLR) adaptation is due to the treatment of the evidence. Unlike minimum classification error linear regression (MCELR), AAPLR has a closed-form solution that enables rapid adaptation. Experimental results reveal that AAPLR speaker adaptation improves speech recognition performance at moderate computational cost compared to maximum likelihood linear regression (MLLR), MAPLR, MCELR and conditional maximum likelihood linear regression (CMLLR). These results are verified for supervised as well as unsupervised adaptation with different amounts of adaptation data.

Pages: 797-807

Page count: 11
