Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors

Cited by: 0
Authors
Silnova, Anna [1]
Brummer, Niko [2]
Garcia-Romero, Daniel [3]
Snyder, David [3]
Burget, Lukas [1]
Affiliations
[1] Brno Univ Technol, Brno, Czech Republic
[2] Nuance Commun, Stellenbosch, South Africa
[3] Johns Hopkins HLTCOE, Baltimore, MD USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
speaker recognition; variational Bayes; heavy-tailed PLDA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The standard state-of-the-art backend for text-independent speaker recognizers that use i-vectors or x-vectors is Gaussian PLDA (G-PLDA), assisted by a Gaussianization step involving length normalization. G-PLDA can be trained with either generative or discriminative methods. It has long been known that heavy-tailed PLDA (HT-PLDA), applied without length normalization, gives similar accuracy, but at considerable extra computational cost. We have recently introduced a fast scoring algorithm for a discriminatively trained HT-PLDA backend. This paper extends that work by introducing a fast, variational Bayes, generative training algorithm. We compare old and new backends, with and without length normalization, with i-vectors and x-vectors, on SRE'10, SRE'16 and SITW.
Pages: 72-76
Number of pages: 5