Nuance - Politecnico di Torino's 2016 NIST Speaker Recognition Evaluation System

Cited by: 6
Authors
Colibro, Daniele [1]
Vair, Claudio [1]
Dalmasso, Emanuele [1]
Farrell, Kevin [1]
Karvitsky, Gennady [1]
Cumani, Sandro [2]
Laface, Pietro [2]
Affiliations
[1] Nuance Commun Inc, Burlington, MA 01803 USA
[2] Politecn Torino, Turin, Italy
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
Speaker Recognition; i-vector; PLDA; PSVM; AS-Norm; Top-Norm;
DOI
10.21437/Interspeech.2017-797
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes the Nuance-Politecnico di Torino (NPT) speaker recognition system submitted to the NIST SRE16 evaluation campaign. It also reports the results of post-evaluation tests, which focus on the performance of generative and discriminative classifiers and of score normalization. The submitted system combines the results of four GMM-IVector models, two DNN-IVector models, and a GMM-SVM acoustic system. Each system exploits acoustic front-end parameters that differ by feature type and dimension. We analyze the main components of our submission, which contributed to obtaining 8.1% EER and 0.532 actual C_primary in the challenging SRE16 Fixed condition.
Pages: 1338-1342
Number of pages: 5
Related papers
17 in total
  • [1] [Anonymous], SPEAK REC EV 2016 NA
  • [2] [Anonymous], ICASSP
  • [3] [Anonymous], P IEEE INT C AC SPEE
  • [4] [Anonymous], 2011, P INT 11
  • [5] Campbell WM, 2006, INT CONF ACOUST SPEE, P97
  • [6] Compensation of nuisance factors for speaker and language recognition
    Castaldo, Fabio
    Colibro, Daniele
    Dalmasso, Emanuele
    Laface, Pietro
    Vair, Claudio
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (07): 1969 - 1978
  • [7] Cumani S, 2017, INT CONF ACOUST SPEE, P5435, DOI 10.1109/ICASSP.2017.7953195
  • [8] Large-Scale Training of Pairwise Support Vector Machines for Speaker Recognition
    Cumani, Sandro
    Laface, Pietro
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (11): 1590 - 1600
  • [9] COMPARISON OF PARAMETRIC REPRESENTATIONS FOR MONOSYLLABIC WORD RECOGNITION IN CONTINUOUSLY SPOKEN SENTENCES
    DAVIS, SB
    MERMELSTEIN, P
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04): 357 - 366
  • [10] Front-End Factor Analysis for Speaker Verification
    Dehak, Najim
    Kenny, Patrick J.
    Dehak, Reda
    Dumouchel, Pierre
    Ouellet, Pierre
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 788 - 798