Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification

Cited by: 28
Authors
Mak, Man-Wai [1]
Rao, Wei [1]
Affiliations
[1] Hong Kong Polytech Univ, Elect & Informat Engn Dept, Ctr Signal Proc, Hong Kong, Hong Kong, Peoples R China
Keywords
Speaker verification; GMM-supervectors (GSV); Utterance partitioning; GMM-SVM; Support vector machine; Random resampling; Data imbalance; MACHINES; ENSEMBLE;
DOI
10.1016/j.specom.2010.06.011
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Recent research has demonstrated the merit of combining Gaussian mixture models (GMMs) and support vector machines (SVMs) for text-independent speaker verification. However, one unaddressed issue in this GMM-SVM approach is the imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, namely utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate the data imbalance problem. Briefly, the sequence order of acoustic vectors in an enrollment utterance is first randomized, and the randomized sequence is then partitioned into a number of segments. Each of these segments is then used to produce a GMM supervector via MAP adaptation and mean vector concatenation. The randomization and partitioning processes are repeated several times to produce a sufficient number of speaker-class supervectors for training an SVM. Experimental evaluations based on the NIST 2002 and 2004 SRE suggest that UP-AVR can reduce the error rate of GMM-SVM systems. (C) 2010 Elsevier B.V. All rights reserved.
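The resampling loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a diagonal-covariance universal background model (UBM) and the common mean-only relevance-MAP update, neither of which is specified in the abstract. The function names, the relevance factor of 16, and the default numbers of partitions and resamplings are hypothetical choices made only for the example.

```python
import numpy as np


def map_adapt_supervector(frames, ubm_weights, ubm_means, ubm_vars, relevance=16.0):
    """Mean-only relevance-MAP adaptation (assumed variant), then stack the means."""
    # Log-density of every frame under every diagonal Gaussian: shape (T, K).
    diff = frames[:, None, :] - ubm_means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / ubm_vars, axis=2)
        + np.sum(np.log(2.0 * np.pi * ubm_vars), axis=1)
    )
    # Component posteriors per frame, computed stably in the log domain.
    log_post = np.log(ubm_weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    soft_counts = post.sum(axis=0)                      # (K,)
    data_means = (post.T @ frames) / np.maximum(soft_counts, 1e-10)[:, None]
    alpha = (soft_counts / (soft_counts + relevance))[:, None]
    adapted_means = alpha * data_means + (1.0 - alpha) * ubm_means
    # Mean vector concatenation: one (K * D)-dimensional supervector.
    return adapted_means.reshape(-1)


def up_avr_supervectors(frames, ubm_weights, ubm_means, ubm_vars,
                        num_partitions=4, num_resamplings=3, seed=0):
    """Shuffle the frame order, partition, and adapt each segment; repeat."""
    rng = np.random.default_rng(seed)
    supervectors = []
    for _ in range(num_resamplings):
        # Randomize the sequence order of the acoustic vectors.
        shuffled = frames[rng.permutation(len(frames))]
        # Partition the randomized sequence into roughly equal segments.
        for segment in np.array_split(shuffled, num_partitions):
            supervectors.append(
                map_adapt_supervector(segment, ubm_weights, ubm_means, ubm_vars)
            )
    return np.stack(supervectors)
```

With T acoustic vectors of dimension D and a K-component UBM, the call returns an array of shape (num_resamplings * num_partitions, K * D), so one enrollment utterance yields several speaker-class supervectors for SVM training instead of one.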
Pages: 119-130
Page count: 12