Nuance - Politecnico di Torino's 2016 NIST Speaker Recognition Evaluation System

Cited by: 6

Authors:

Colibro, Daniele [1]

Vair, Claudio [1]

Dalmasso, Emanuele [1]

Farrell, Kevin [1]

Karvitsky, Gennady [1]

Cumani, Sandro [2]

Laface, Pietro [2]

Affiliations:

[1] Nuance Communications Inc., Burlington, MA 01803, USA

[2] Politecnico di Torino, Turin, Italy

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

Speaker Recognition; i-vector; PLDA; PSVM; AS-Norm; Top-Norm;

DOI:

10.21437/Interspeech.2017-797

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes the Nuance-Politecnico di Torino (NPT) speaker recognition system submitted to the NIST SRE16 evaluation campaign. Included are the results of post-evaluation tests, focusing on the analysis of the performance of generative and discriminative classifiers, and of score normalization. The submitted system combines the results of four GMM-IVector models, two DNN-IVector models, and a GMM-SVM acoustic system. Each system exploits acoustic front-end parameters that differ by feature type and dimension. We analyze the main components of our submission, which contributed to obtaining 8.1% EER and 0.532 actual C_primary in the challenging SRE16 Fixed condition.


Pages: 1338-1342

Page count: 5

References (17 in total):

[1] [Anonymous], SPEAK REC EV 2016 NA
[2] [Anonymous], ICASSP
[3] [Anonymous], P IEEE INT C AC SPEE
[4] [Anonymous], 2011, P INT 11
[5] Campbell WM, 2006, INT CONF ACOUST SPEE, P97
[6] Castaldo, Fabio; Colibro, Daniele; Dalmasso, Emanuele; Laface, Pietro; Vair, Claudio. Compensation of nuisance factors for speaker and language recognition [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (07): 1969-1978
[7] Cumani S, 2017, INT CONF ACOUST SPEE, P5435, DOI 10.1109/ICASSP.2017.7953195
[8] Cumani, Sandro; Laface, Pietro. Large-Scale Training of Pairwise Support Vector Machines for Speaker Recognition [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (11): 1590-1600
[9] Davis, S. B.; Mermelstein, P. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04): 357-366
[10] Dehak, Najim; Kenny, Patrick J.; Dehak, Reda; Dumouchel, Pierre; Ouellet, Pierre. Front-End Factor Analysis for Speaker Verification [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 788-798
