Speaker clustering for speech recognition using vocal tract parameters

Cited by: 10
Authors
Naito, M
Deng, L
Sagisaka, Y
Affiliations
[1] ATR, Interpreting Telephony Res Labs, Kyoto 6190288, Japan
[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Keywords
vocal tract parameters; speaker clustering; speech recognition
DOI
10.1016/S0167-6393(00)00089-3
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We propose speaker clustering methods for speech recognition based on articulatory parameters that are related to vocal tract (VT) size and associated with individual speakers. Two parameters characterizing gross VT dimensions are first derived from the formant frequencies of two vowels and are then used to cluster speakers. The resulting speaker clusters differ significantly from those obtained by conventional acoustic criteria. Phoneme recognition experiments are then carried out using speaker-clustered HMMs (SC-HMMs), one trained for each cluster. The proposed method requires only a small amount of speech data, both for speaker clustering and for selecting the most suitable SC-HMM for a target speaker, yet gives higher recognition rates than conventional speaker clustering methods based on acoustic criteria. (C) 2002 Elsevier Science B.V. All rights reserved.
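The clustering and model-selection steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the clustering algorithm (plain k-means is swapped in), the two VT parameters are treated as an already-estimated pair of numbers per speaker (their derivation from formant frequencies is not shown), and the sample values and the function names `kmeans` and `select_cluster` are hypothetical.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iters=50):
    """Plain k-means on 2-D points, initialised from the first k points.

    Assumption: k-means stands in for whatever clustering the paper uses.
    """
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep the old centroid if a cluster lost all its members.
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

def select_cluster(vt_params, centroids):
    """Pick the cluster (and hence the SC-HMM) for a new target speaker."""
    return nearest(vt_params, centroids)

# Hypothetical (made-up) VT parameter pairs, one per training speaker.
speakers = [(8.0, 8.5), (9.2, 9.0), (8.1, 8.4), (9.0, 9.1), (7.9, 8.6), (9.1, 8.9)]

labels, centroids = kmeans(speakers, k=2)
target = (8.05, 8.5)  # hypothetical parameters estimated for a new speaker
print("cluster labels:", labels)
print("SC-HMM index for target speaker:", select_cluster(target, centroids))
```

In this sketch, selecting a model for a new speaker needs only that speaker's two parameters, which is in the spirit of the abstract's point that little speech data is required for selection.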
Pages: 305-315
Number of pages: 11