Speaker adaptation for telephony data using speaker clustering

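Cited by: 0
Authors
Wu, C [1]
Lubensky, D [1]
Wang, ZH [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
Source
2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III | 2000
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract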
This paper reports an ongoing effort to develop an unsupervised on-line speaker adaptation method for the telephony environment. All speakers in the training corpus are acoustically pre-clustered, and a cluster-dependent system is built for each cluster. Given a new telephony test speaker, the cluster closest to the speaker is selected using an improved distance measure. Starting from the selected cluster, the MLLR adaptation algorithm with a block-diagonal transformation is applied to move the cluster model closer to the test speaker. In telephony applications the adaptation data can be very short or noisy, so the MLLR-adapted means can be unreliable. A MAP-like weighting scheme for MLLR adaptation is therefore applied to keep the adapted means reliable when the adaptation data is very short.
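The two steps that the abstract names, cluster selection and MAP-like weighting of the MLLR-adapted means, can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' method: plain Euclidean distance stands in for the paper's improved distance measure (which the abstract does not describe), the interpolation weight n / (n + tau) is one common MAP-style form, and the names closest_cluster, map_weighted_mean, and tau are hypothetical.

```python
import math


def closest_cluster(speaker_mean, cluster_means):
    """Return the index of the cluster mean nearest to the speaker.

    Assumption: Euclidean distance is used as a stand-in for the
    paper's improved distance measure, which the abstract omits.
    """
    return min(
        range(len(cluster_means)),
        key=lambda k: math.dist(speaker_mean, cluster_means[k]),
    )


def map_weighted_mean(cluster_mean, mllr_mean, n_frames, tau=100.0):
    """Interpolate between the cluster mean and the MLLR-adapted mean.

    Assumption: tau is a hypothetical prior weight. With few adaptation
    frames the result stays near the cluster mean; with many frames it
    approaches the MLLR-adapted mean.
    """
    w = n_frames / (n_frames + tau)
    return [w * m + (1.0 - w) * c for c, m in zip(cluster_mean, mllr_mean)]


if __name__ == "__main__":
    clusters = [[0.0, 0.0], [1.0, 1.2], [5.0, 5.0]]
    k = closest_cluster([1.0, 1.0], clusters)
    # Only 20 frames of adaptation data: stay close to the cluster mean.
    print(k, map_weighted_mean(clusters[k], [1.4, 0.8], n_frames=20))
```

In this sketch a short utterance yields a small weight on the MLLR estimate, which mirrors the abstract's motivation: when adaptation data is scarce or noisy, the adapted mean should not drift far from the selected cluster model.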
Pages: 768-771
Page count: 4