Quality measures for speaker verification with short utterances

Cited by: 15
Authors
Poddar, Arnab [1]
Sahidullah, Md [2]
Saha, Goutam [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Dept Elect & Elect Commun Engn, Kharagpur 721302, W Bengal, India
[2] Univ Lorraine, CNRS, INRIA, MULTISPEECH Team, LORIA, F-54000 Nancy, France
Keywords
Gaussian mixture model (GMM); Identity vector (i-vector); Short utterances; Speaker verification; Total variability; Universal background model (UBM); RECOGNITION; COMPUTATION; FUSION; MFCC
DOI
10.1016/j.dsp.2019.01.023
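Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract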
The performance of automatic speaker verification (ASV) systems degrades as the amount of speech used for enrollment and verification is reduced. Combining multiple systems based on different features and classifiers considerably reduces the speaker verification error rate with short utterances. This work attempts to incorporate supplementary information into the system combination process, namely the quality of the estimated model parameters. We introduce a class of novel quality measures formulated from the zero-order sufficient statistics used during the i-vector extraction process, and use them as side information for combining ASV systems based on the Gaussian mixture model-universal background model (GMM-UBM) and the i-vector. The proposed methods demonstrate considerable improvement in speaker recognition performance on NIST SRE corpora, especially in short-duration conditions. We have also observed improvement over existing systems based on different duration-based quality measures. (C) 2019 Elsevier Inc. All rights reserved.
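For readers unfamiliar with the term, the zero-order sufficient statistics mentioned in the abstract are the per-component sums of frame posteriors under the UBM. The sketch below is illustrative only: it assumes a diagonal-covariance GMM, and the function name `zero_order_stats` and the toy parameters are hypothetical. It does not reproduce the paper's quality measures, whose exact formulation is not given in this record.

```python
import math


def zero_order_stats(frames, weights, means, variances):
    """Return N_c = sum_t P(c | x_t) for a diagonal-covariance GMM.

    frames:    list of feature vectors (lists of floats)
    weights:   mixture weights, one per component
    means:     per-component mean vectors
    variances: per-component diagonal variances
    """
    stats = [0.0] * len(weights)
    for x in frames:
        # Log of (weight * Gaussian density) for each component.
        log_p = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            log_p.append(ll)
        # Normalise with log-sum-exp to obtain posteriors.
        m = max(log_p)
        log_norm = m + math.log(sum(math.exp(lp - m) for lp in log_p))
        for c, lp in enumerate(log_p):
            stats[c] += math.exp(lp - log_norm)
    return stats


# Toy example (assumed values): two 1-D components, three frames.
frames = [[0.0], [0.1], [5.0]]
stats = zero_order_stats(frames, [0.5, 0.5], [[0.0], [5.0]], [[1.0], [1.0]])
```

Because the posteriors of each frame sum to one, the statistics always sum to the number of frames. A short utterance therefore spreads very little mass over the UBM components, which is one reason such statistics are a plausible basis for a quality indicator.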
Pages: 66-79
Number of pages: 14