Text-Independent Speaker Identification Using Vocal Tract Length Normalization for Building Universal Background Model

Cited by: 0
Authors
Sarkar, A. K. [1]
Umesh, S. [1]
Rath, S. P. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Kanpur 208016, Uttar Pradesh, India
Source
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009
Keywords
Speaker Identification; VTLN; Iterative MAP; UBM
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose using Vocal Tract Length Normalization (VTLN) to build the Universal Background Model (UBM) for a closed-set speaker identification system. Vocal Tract Length (VTL) differences among speakers are a major source of variability in the speech signal. Since the UBM is trained using data from many speakers, it statistically captures this inherent variation in the speech signal, which results in a "coarse" model in the acoustic space. This may cause the adapted speaker models obtained from the UBM to overlap significantly in the acoustic space. We hypothesize that the use of VTLN will help compact the UBM, so that the speaker-adapted models obtained from this compact model will have better speaker separability in the acoustic space. We perform experiments on the MIT, TIMIT and NIST 2004 SRE databases and show that using VTLN we can achieve lower identification error rates than the conventional GMM-UBM based method.
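The conventional GMM-UBM pipeline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes diagonal-covariance Gaussians, mean-only relevance MAP adaptation with a hypothetical relevance factor of 16, and synthetic 2-D features. VTLN itself (warping the frequency axis of each speaker's features before the UBM is trained) is not implemented here; in the proposed method the UBM parameters passed in below would come from VTLN-normalized data. All function names are illustrative.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Per-component log densities of a diagonal-covariance GMM.
    X: (T, D); means, variances: (M, D). Returns an array of shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    return -0.5 * (log_norm + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under the GMM (log-sum-exp over components)."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    peak = log_p.max(axis=1, keepdims=True)
    return float(np.mean(peak[:, 0] + np.log(np.exp(log_p - peak).sum(axis=1))))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only relevance MAP adaptation of the UBM towards one speaker's data.
    The relevance factor is an assumed value, not taken from the paper."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    log_p = log_p - log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)            # responsibilities, (T, M)
    n = post.sum(axis=0)                               # soft counts, (M,)
    first_moment = post.T @ X / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]             # data-dependent mixing weight
    return alpha * first_moment + (1.0 - alpha) * means

def identify(X, weights, speaker_means, variances):
    """Closed-set identification: pick the enrolled speaker with the highest score."""
    scores = {spk: avg_log_likelihood(X, weights, mu, variances)
              for spk, mu in speaker_means.items()}
    return max(scores, key=scores.get)

# Toy usage with a hand-set 2-component UBM and two synthetic speakers.
rng = np.random.default_rng(0)
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[-1.0, 0.0], [1.0, 0.0]])
ubm_var = np.ones((2, 2))
enrol = {"A": rng.normal([-1.0, 2.0], 0.5, size=(200, 2)),
         "B": rng.normal([1.0, -2.0], 0.5, size=(200, 2))}
models = {spk: map_adapt_means(x, ubm_w, ubm_mu, ubm_var) for spk, x in enrol.items()}
test_utt = rng.normal([-1.0, 2.0], 0.5, size=(50, 2))
predicted = identify(test_utt, ubm_w, models, ubm_var)
```

Because every speaker model is derived from the same UBM, a broad UBM tends to produce adapted models that sit close together; the abstract's hypothesis is that a more compact, VTLN-based UBM improves their separation.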
Pages: 2311-2314
Number of pages: 4