Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models

Cited by: 34

Authors:

Ahadi, SM

Woodland, PC

Affiliation:

[1] Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ

Source:

COMPUTER SPEECH AND LANGUAGE | 1997 / Vol. 11 / No. 03

DOI:

10.1006/csla.1997.0031

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

One problem faced by some model adaptation techniques is that only the parameters of those models which are observed in the adaptation data are updated. Hence, with small amounts of adaptation data most of the system parameters remain unchanged. In this paper, a technique called regression-based model prediction (RMP), which tries to overcome this problem, is presented. This technique tries to adapt the model parameters of a continuous density hidden Markov model set which has insufficient adaptation data when used with maximum a posteriori (MAP) estimation. The technique uses the parameters of better estimated models and a set of parameter relationships between the model parameters to update the parameters of models with insufficient adaptation data. The parameter relationships are found by applying linear regression to a number of speaker-specific model sets. Experiments using both MAP estimation and RMP are presented using the ARPA RM1 continuous speech database and RMP has been found to be useful in improving the system performance with as little as 3 s of adaptation speech. RMP has been shown to consistently improve the results obtained by MAP. When a very large number of adaptation sentences are used the error rates converge towards those of MAP. It is shown that RMP gives an improvement of 8.8% over the baseline error rate with a single adaptation sentence, and 27% with 40 adaptation sentences. © 1997 Academic Press Limited.
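The idea described in the abstract can be illustrated with a toy sketch. It is not the paper's formulation: it assumes scalar Gaussian means, a single source/target parameter pair, the standard conjugate-prior MAP mean update with a hypothetical prior weight `tau`, and made-up numbers throughout. The names `map_mean` and `fit_line` are illustrative only. A linear relationship between two parameters is fitted across training speakers, the well-observed parameter is MAP-adapted for a new speaker, and the unobserved parameter is then predicted from it.

```python
from statistics import mean


def map_mean(prior_mean, frames, tau):
    """MAP estimate of a scalar Gaussian mean (assumed conjugate-prior form).

    Interpolates between the prior mean and the adaptation frames;
    tau is a hypothetical prior weight, not a value from the paper.
    """
    return (tau * prior_mean + sum(frames)) / (tau + len(frames))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


# Made-up training data: one scalar mean per training speaker for a
# "source" model parameter and a "target" model parameter.
source_means = [1.0, 2.0, 3.0, 4.0]
target_means = [2.5, 4.5, 6.5, 8.5]
a, b = fit_line(source_means, target_means)

# New speaker: the source parameter was seen in the adaptation data,
# so it is MAP-adapted; the target parameter was not seen at all.
adapted_source = map_mean(prior_mean=2.5, frames=[3.0, 3.2, 2.8, 3.0], tau=4.0)

# Predict the unobserved target parameter from the adapted source parameter.
predicted_target = a * adapted_source + b
```

With more adaptation frames the MAP estimate moves away from the prior toward the data mean, which is consistent with the abstract's remark that the error rates converge towards those of MAP as adaptation data grows.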

Pages: 187-206

Number of pages: 20
