Support vector machines using GMM supervectors for speaker verification

Cited by: 703
Authors
Campbell, WM [1]
Sturim, DE [1]
Reynolds, DA [1]
Affiliations
[1] MIT, Lincoln Lab, Lexington, MA 02420 USA
Keywords
Gaussian mixture models (GMMs); speaker recognition; support vector machines (SVMs)
DOI
10.1109/LSP.2006.870086
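Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract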
Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea of stacking the means of the GMM model to form a GMM mean supervector. We examine the idea of using the GMM supervector in a support vector machine (SVM) classifier. We propose two new SVM kernels based on distance metrics between GMM models. We show that these SVM kernels produce excellent classification accuracy in a NIST speaker recognition evaluation task.
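The pipeline described in the abstract (MAP adaptation of the mixture means, stacking the adapted means into a supervector, and comparing supervectors with an SVM kernel) can be illustrated with a minimal sketch. The abstract does not state the form of the two proposed kernels, so the weight- and variance-normalized linear kernel below is an assumption chosen for illustration, as are the function names, the relevance factor of 16, and the synthetic data.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance background GMM.
    X: (T, D) frames; weights: (K,); means, variances: (K, D).
    The relevance factor is a hypothetical default, not taken from the record."""
    # Log-likelihood of every frame under every component, shape (T, K).
    diff = X[:, None, :] - means[None, :, :]
    log_p = -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_p += np.log(weights)
    # Posterior responsibilities, normalized per frame in a stable way.
    log_p -= log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)
    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                          # (K,)
    first = post.T @ X                            # (K, D)
    data_mean = first / np.maximum(n, 1e-10)[:, None]
    # Interpolate between the data mean and the background mean.
    alpha = (n / (n + relevance))[:, None]
    return alpha * data_mean + (1.0 - alpha) * means

def supervector(weights, means, variances):
    """Stack the component means into one vector of length K * D.
    The sqrt(weight) / sqrt(variance) scaling is an assumed normalization."""
    return (np.sqrt(weights)[:, None] * means / np.sqrt(variances)).ravel()

def linear_kernel(sv_a, sv_b):
    """Inner product between two (scaled) supervectors."""
    return float(sv_a @ sv_b)

# Synthetic background model and two synthetic "utterances".
rng = np.random.default_rng(0)
K, D = 4, 3
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, D))
variances = np.ones((K, D))
utt_a = rng.normal(size=(200, D)) + 0.5
utt_b = rng.normal(size=(150, D)) - 0.5

sv_a = supervector(weights, map_adapt_means(utt_a, weights, means, variances), variances)
sv_b = supervector(weights, map_adapt_means(utt_b, weights, means, variances), variances)
k_ab = linear_kernel(sv_a, sv_b)
```

A Gram matrix built from such kernel values could then be passed to any SVM trainer that accepts precomputed kernels; that step is omitted here.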
Pages: 308-311
Page count: 4