Speaker clustering for speech recognition using vocal tract parameters

Cited by: 10

Authors:

Naito, M

Deng, L

Sagisaka, Y

Affiliations:

[1] ATR, Interpreting Telephony Res Labs, Kyoto 6190288, Japan

[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

Source:

SPEECH COMMUNICATION | 2002 / Vol. 36 / Issue 3-4

Keywords:

vocal tract parameters; speaker-clustering; speech recognition

DOI:

10.1016/S0167-6393(00)00089-3

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We propose speaker clustering methods for speech recognition based on vocal tract (VT) size-related articulatory parameters associated with individual speakers. Two parameters characterizing gross VT dimensions are first derived from the formant frequencies of two vowels and are then used to cluster speakers. The resulting speaker clusters are significantly different from speaker clusters obtained by conventional acoustic criteria. Phoneme recognition experiments are then carried out using speaker-clustered HMMs (SC-HMMs) trained for each cluster. The proposed method requires only a small amount of speech data for speaker clustering and for selecting the most suitable SC-HMM for a target speaker, yet gives higher recognition rates than conventional speaker clustering methods based on acoustic criteria. © 2002 Elsevier Science B.V. All rights reserved.

Pages: 305-315

Page count: 11