Speaker Recognition Based on Variational Bayesian Method

Cited by: 0
Authors
Ito, Tatsuya [1]
Hashimoto, Kei [1]
Nankaku, Yoshihiko [1]
Lee, Akinobu [1]
Tokuda, Keiichi [1]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi, Japan
Source
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008
Keywords
speaker recognition; GMM; variational Bayesian method
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a speaker identification system based on Gaussian mixture models (GMMs) trained with the variational Bayesian (VB) method. Maximum likelihood (ML) and maximum a posteriori (MAP) estimation are well-known methods for estimating GMM parameters. However, because both rely on a point estimate of the model parameters, they suffer from overtraining when the training data are insufficient. The Bayesian approach instead estimates a posterior distribution over the model parameters and achieves more robust prediction than the ML and MAP approaches. To handle the complicated integrals that arise in the Bayesian approach, the variational Bayesian method has been proposed and applied to many classification problems using latent variable models. However, the performance of the Bayesian approach has not been extensively investigated in large speaker identification tasks. The experimental results show that the VB method mitigates the overtraining problem better than the conventional ML and MAP methods.
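
The contrast drawn in the abstract between point estimation and the Bayesian approach can be written out as follows. This is a sketch in standard textbook notation, not necessarily the notation used in the paper: the symbols $X_s$ (training data for speaker $s$), $x$ (test utterance), $\theta$ (GMM parameters), $Z$ (latent mixture assignments), and $q$ (variational posterior) are assumptions introduced here for illustration.

```latex
% ML point estimate and the resulting identification rule
\hat{\theta}_s = \arg\max_{\theta} \, p(X_s \mid \theta),
\qquad
\hat{s} = \arg\max_{s} \, p(x \mid \hat{\theta}_s)

% Bayesian predictive distribution: integrate over the parameter posterior
p(x \mid X_s) = \int p(x \mid \theta) \, p(\theta \mid X_s) \, d\theta

% Variational Bayes: approximate the posterior with a factorized q
% and maximize a lower bound on the log marginal likelihood
\log p(X_s) \;\ge\; \mathcal{F}[q]
= \mathbb{E}_{q(Z)\,q(\theta)}\!\left[ \log \frac{p(X_s, Z, \theta)}{q(Z)\, q(\theta)} \right]
```

The integral in the second expression is what makes the exact Bayesian approach costly for latent variable models such as GMMs; the bound in the third expression replaces it with an optimization over $q(Z)$ and $q(\theta)$.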
Pages: 1417-1420
Page count: 4