UBM based speaker selection and model re-estimation for speaker adaptation

Cited by: 0
Authors
Wang, Jian [1 ]
Guo, Jun [1 ]
Liu, Gang [1 ]
Lei, Jianjun [1 ]
Affiliation
[1] Beijing Univ Posts & Telecommun, Sch Informat Engn, Wangjian 200810, Peoples R China
Source
PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2 | 2006
Funding
National Natural Science Foundation of China;
Keywords
speaker adaptation; speaker selection; UBM;
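DOI
Not available
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract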
Speaker adaptation based on speaker selection can achieve promising performance. In such a system, how to represent a speaker and how to compute the selection remain open issues. In this paper, we represent each speaker with a Gaussian mixture model (GMM) adapted from a universal background model (UBM). The likelihood ratio (LR) and the cross likelihood ratio (CLR) are used for speaker selection. Furthermore, a single-pass re-estimation procedure conditioned on the speaker-independent model is presented. This adaptation strategy was evaluated on a large-vocabulary speech recognition task, where it achieved a relative gain of 11% over the baseline system.
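
The abstract does not give the scoring formulas, so the following is only a minimal sketch of how LR- and CLR-based selection is commonly formulated, not the authors' implementation. It assumes diagonal-covariance GMMs stored as lists of (weight, mean, variance) tuples, per-frame average log-likelihoods, and a CLR defined as the sum of the two cross-scored LRs; the function names and the data layout are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def avg_log_likelihood(frames, gmm):
    """Per-frame average log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def likelihood_ratio(frames, speaker_gmm, ubm):
    """Log LR: how much better the speaker GMM explains frames than the UBM."""
    return avg_log_likelihood(frames, speaker_gmm) - avg_log_likelihood(frames, ubm)


def cross_likelihood_ratio(frames_a, gmm_a, frames_b, gmm_b, ubm):
    """Symmetric score: each speaker's data scored against the other's model."""
    return (likelihood_ratio(frames_a, gmm_b, ubm)
            + likelihood_ratio(frames_b, gmm_a, ubm))


def select_speakers(test_frames, speaker_gmms, ubm, n):
    """Return the names of the n training speakers with the highest LR."""
    scores = {name: likelihood_ratio(test_frames, gmm, ubm)
              for name, gmm in speaker_gmms.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

In such a scheme the data of the selected speakers would then feed the re-estimation step; how the paper combines LR and CLR, and how many speakers it selects, is not stated in this record.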
Pages: 856-860
Page count: 5