Efficient speaker recognition using approximated cross entropy (ACE)

Cited by: 15
Authors
Aronowitz, Hagai
Burshtein, David
Affiliations
[1] Bar Ilan Univ, Dept Comp Sci, IL-52900 Ramat Gan, Israel
[2] Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel
Source
IEEE Transactions on Audio, Speech, and Language Processing | 2007 / Vol. 15 / No. 7
Keywords
speaker identification; speaker indexing; speaker recognition; speaker retrieval; speaker verification
DOI
10.1109/TASL.2007.902059
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Techniques for efficient speaker recognition are presented. These techniques are based on approximating Gaussian mixture modeling (GMM) likelihood scoring using approximated cross entropy (ACE). Gaussian mixture modeling is used for representing both training and test sessions and is shown to perform speaker recognition and retrieval extremely efficiently without any notable degradation in accuracy compared to classic GMM-based recognition. In addition, a GMM compression algorithm is presented. This algorithm considerably decreases the storage needed for speaker retrieval.
Pages: 2033-2043
Page count: 11