Structural MAP Adaptation in GMM-Supervector based Speaker Recognition

Cited by: 0
Authors
Ferras, Marc [1]
Shinoda, Koichi [1]
Furui, Sadaoki [1]
Affiliations
[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan
Source
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011
Keywords
speaker recognition; SMAP; MAP; GMM-SVM; VERIFICATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In recent years, adaptation techniques have received special attention in speaker recognition, mainly with the aim of disentangling speaker and session variation under the Maximum a Posteriori (MAP) criterion. In these techniques, unseen mixtures are usually adapted in a global manner, if at all. In this paper, we explore Structural MAP (SMAP), a Maximum a Posteriori adaptation method that uses hierarchical structures of the acoustic space so that data scarcity can be tackled at different levels of precision. We explore this approach in a speaker verification system using a Support Vector Machine (SVM) classifier and Gaussian mean supervectors (GMM-SVM). We show that this is an effective approach that considerably outperforms its relevance MAP counterpart in the 2006 NIST Speaker Recognition Evaluation. We also show that using a speaker-adapted Universal Background Model can improve the stability of the clustering algorithm and yields further improvements.
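For orientation, the relevance MAP baseline that the abstract compares against can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it adapts only the means of a diagonal-covariance GMM using the standard relevance-factor update, and it stacks the adapted means into a supervector without the weight and covariance scaling normally applied before the SVM. The function names, the relevance factor of 16, and the toy data are all hypothetical. SMAP itself, with its hierarchical structure, is not implemented here.

```python
import math

def gaussian_diag(x, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at frame x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def relevance_map_means(frames, weights, means, variances, relevance=16.0):
    """Adapt the background-model means to the frames (relevance MAP, means only).

    `relevance` is an assumed illustrative value, not taken from the paper.
    """
    n_mix, dim = len(means), len(means[0])
    counts = [0.0] * n_mix                       # soft frame counts per mixture
    first = [[0.0] * dim for _ in range(n_mix)]  # posterior-weighted frame sums
    for x in frames:
        likes = [w * gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame underflowed under every mixture; skip it
        for i, like in enumerate(likes):
            post = like / total
            counts[i] += post
            for d in range(dim):
                first[i][d] += post * x[d]
    adapted = []
    for i in range(n_mix):
        alpha = counts[i] / (counts[i] + relevance)  # data-dependent mixing weight
        if counts[i] > 0.0:
            data_mean = [s / counts[i] for s in first[i]]
        else:
            data_mean = list(means[i])
        adapted.append([alpha * e + (1.0 - alpha) * m
                        for e, m in zip(data_mean, means[i])])
    return adapted, counts

def supervector(adapted_means):
    """Concatenate the adapted means into one flat vector (no scaling applied)."""
    return [value for mean in adapted_means for value in mean]

# Toy usage: a two-mixture, one-dimensional background model and 16 frames at 1.0.
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_vars = [[1.0], [1.0]]
frames = [[1.0]] * 16
adapted, counts = relevance_map_means(frames, ubm_weights, ubm_means, ubm_vars)
sv = supervector(adapted)
# The first mean moves about halfway toward the data; the second mixture
# receives almost no frames and stays essentially at its prior value of 10.0.
print(sv, counts)
```

In the toy run the second mixture is effectively unseen, so its mean stays at the background-model value. That is the situation the abstract describes, and the one that hierarchical structures such as SMAP are meant to handle with more precision.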
Pages: 5432-5435
Page count: 4