THE EFFECT OF LANGUAGE FACTORS FOR ROBUST SPEAKER RECOGNITION

Authors
Lu, Liang [1]
Dong, Yuan [1, 2]
Zhao, Xianyu [2]
Liu, Jiqing [1]
Wang, Haila [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
[2] France Telecom Res & Dev Ctr, Beijing 100083, Peoples R China
Source
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009
Keywords
Speaker recognition; Joint Factor Analysis; Eigenchannels; Language Factor Compensation
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Results from the NIST speaker recognition evaluations in recent years show that speaker recognition systems developed mainly on English training data suffer from a language gap: performance on non-English trials is much worse than on English trials. This paper addresses that problem. Building on the conventional joint factor analysis model, we introduce language factors, which are meant to capture the language characteristics of each test and training utterance, and we compensate by removing the language factors in order to shrink the difference between languages. Experiments on 2006 NIST SRE data show that language factor compensation alone can reduce the gap between the performance on English and non-English trials, and that score-level combination with eigenchannels can further improve the performance on non-English trials. For example, for the female part we observed a relative reduction of about 19% in EER compared with eigenchannel session variability compensation alone.
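The compensation idea in the abstract, removing the part of an utterance representation that is explained by language factors, can be illustrated with a minimal sketch. It is not the paper's method: joint factor analysis estimates factors from GMM statistics, whereas this sketch assumes a plain mean-centered supervector and a small set of hypothetical "language directions", and estimates the factors by least-squares projection. The names `remove_language_factors`, `orthonormalize`, and `language_directions` are illustrative assumptions.

```python
from math import sqrt


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(directions, tol=1e-12):
    """Gram-Schmidt: build an orthonormal basis spanning the given directions."""
    basis = []
    for d in directions:
        v = list(d)
        for b in basis:
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = sqrt(dot(v, v))
        if norm > tol:  # skip directions already covered by the basis
            basis.append([vi / norm for vi in v])
    return basis


def remove_language_factors(supervector, mean, language_directions):
    """Subtract the component of (supervector - mean) lying in the language subspace.

    Assumption: factors are estimated by projection onto an orthonormalized
    basis, a simplification of the factor estimation used in factor analysis.
    Returns the compensated supervector and the factors in that basis.
    """
    basis = orthonormalize(language_directions)
    centered = [s - m for s, m in zip(supervector, mean)]
    factors = [dot(centered, b) for b in basis]
    compensated = list(supervector)
    for f, b in zip(factors, basis):
        compensated = [c - f * bi for c, bi in zip(compensated, b)]
    return compensated, factors


# Toy example: one language direction along the first coordinate.
compensated, factors = remove_language_factors(
    [2.0, 3.0, 4.0], [0.0, 0.0, 0.0], [[1.0, 0.0, 0.0]]
)
print(compensated, factors)
```

In this toy case the first coordinate is treated as purely language-dependent, so compensation zeroes it and leaves the other two coordinates untouched.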
Pages: 4217 / +
Number of pages: 2