ANALYSIS OF THE DNN-BASED SRE SYSTEMS IN MULTI-LANGUAGE CONDITIONS

Cited by: 0

Authors:

Novotny, Ondrej [1]

Matejka, Pavel

Glembek, Ondrej

Plchot, Oldrich

Grezl, Frantisek

Burget, Lukas

Cernocky, Jan

Affiliations:

[1] Brno Univ Technol, Speech FIT, Brno, Czech Republic

Source:

2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016) | 2016

Keywords:

DNN; Multi-Language; Speaker Recognition;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper analyzes the behavior of our state-of-the-art Deep Neural Network/i-vector/PLDA-based speaker recognition systems in multi-language conditions. On the "Language Pack" of the PRISM set, we evaluate the systems' performance using NIST's standard metrics. We show that the gain from using DNNs vanishes, that using dedicated DNNs for the target conditions does not help, and that the DNN-based systems tend to produce de-calibrated scores under the studied conditions. This work suggests directions for future research rather than any particular solutions to these issues.

References

Pages: 199-204

Page count: 6

26 references in total

[1]

[Anonymous], 1995, Speech coding and synthesis

[2]

[Anonymous], INTERSPEECH 2013

[3]

Cumani Sandro, 2016, ICASSP

[4]

Dehak N., 2010, AUDIO SPEECH LANGUAG

[5]

Fer Radek, 2015, INTERSPEECH 2015

[6]

Ferrer L., 2011, PROC NIST SPEAKER RE, P1

[7]

Garcia-Romero D., 2014, SLT

[8]

Harper M., 2013, ASRU 2013

[9] Deep Neural Networks for Acoustic Modeling in Speech Recognition [J].

Hinton, Geoffrey ;

Deng, Li ;

Yu, Dong ;

Dahl, George E. ;

Mohamed, Abdel-rahman ;

Jaitly, Navdeep ;

Senior, Andrew ;

Vanhoucke, Vincent ;

Nguyen, Patrick ;

Sainath, Tara N. ;

Kingsbury, Brian .

IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) :82-97

[10]

Karafiát M, 2013, INTERSPEECH, P2588
