Speaker Weight Estimation from Speech Signals Using a Fusion of the i-vector and NFA Frameworks

Authors
Poorjam, Amir Hossein [1]
Bahari, Mohamad Hasan [1]
Van Hamme, Hugo [1]
Affiliation
[1] Katholieke Univ Leuven, Ctr Proc Speech & Images, Leuven, Belgium
Source
2015 INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP) | 2015
Keywords
i-vector; Non-negative Factor Analysis; Least-Squares Support Vector Regression; Speaker Weight Estimation; HEIGHT;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel approach for automatic speaker weight estimation from spontaneous telephone speech signals is proposed. In this method, each utterance is modeled using the i-vector framework, which is based on factor analysis of Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework, which is based on a constrained factor analysis of GMM weights. The information in both the Gaussian means and the Gaussian weights is then exploited through a feature-level fusion of the i-vectors and the NFA vectors. Finally, least-squares support vector regression (LS-SVR) is employed to estimate the weight of speakers from given utterances. The proposed approach is evaluated on the telephone speech signals of the National Institute of Standards and Technology (NIST) 2008 and 2010 Speaker Recognition Evaluation (SRE) corpora. Experimental results over 2339 utterances show that the correlation coefficients between actual and estimated weights of male and female speakers are 0.56 and 0.49, respectively, which indicates the effectiveness of the proposed method in speaker weight estimation.
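The back end described in the abstract (feature-level fusion, LS-SVR, correlation scoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that fusion means concatenating the two per-utterance vectors, uses the standard LS-SVM regression linear system with an RBF kernel, and runs on random placeholder data rather than real i-vectors or NFA vectors. The vector dimensions, the values of `gamma` and `sigma`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def fuse(ivectors, nfa_vectors):
    # Feature-level fusion, assumed here to be per-utterance concatenation.
    return np.hstack([ivectors, nfa_vectors])

def rbf_kernel(A, B, sigma):
    # Squared Euclidean distances between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma, sigma):
    # Standard LS-SVM regression dual: solve
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def pearson(a, b):
    # Pearson correlation coefficient between two 1-D arrays.
    return float(np.corrcoef(a, b)[0, 1])

# Placeholder data only: random vectors standing in for real features.
rng = np.random.default_rng(0)
ivec = rng.normal(size=(200, 20))   # hypothetical i-vector dimension
nfa = rng.normal(size=(200, 10))    # hypothetical NFA vector dimension
weight_kg = 70 + 5 * ivec[:, 0] + 3 * nfa[:, 0] + rng.normal(size=200)

X = fuse(ivec, nfa)
X_train, y_train = X[:150], weight_kg[:150]
X_test, y_test = X[150:], weight_kg[150:]

gamma, sigma = 10.0, 5.0            # hypothetical hyperparameters
b, alpha = lssvr_fit(X_train, y_train, gamma, sigma)
y_hat = lssvr_predict(X_train, b, alpha, X_test, sigma)
r = pearson(y_test, y_hat)
print(f"held-out correlation on synthetic data: {r:.2f}")
```

The printed correlation describes the synthetic data only and says nothing about the 0.56 and 0.49 figures reported in the abstract. In practice the hyperparameters would be tuned, for example by cross-validation, and separate models would presumably be trained for male and female speakers, since the abstract reports results per gender.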
Pages: 118-123
Page count: 6