ANALYSIS OF THE DNN-BASED SRE SYSTEMS IN MULTI-LANGUAGE CONDITIONS

Cited by: 0
Authors
Novotny, Ondrej [1]
Matejka, Pavel
Glembek, Ondrej
Plchot, Oldrich
Grezl, Frantisek
Burget, Lukas
Cernocky, Jan
Affiliation
[1] Brno Univ Technol, Speech FIT, Brno, Czech Republic
Source
2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016) | 2016
Keywords
DNN; Multi-Language; Speaker Recognition;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper analyzes the behavior of our state-of-the-art Deep Neural Network/i-vector/PLDA-based speaker recognition systems in multi-language conditions. On the "Language Pack" of the PRISM set, we evaluate the systems' performance using NIST's standard metrics. We show that the gain from using DNNs vanishes, that using DNNs dedicated to the target conditions does not help, and that the DNN-based systems tend to produce de-calibrated scores under the studied conditions. This work suggests directions for future research rather than particular solutions to these issues.
Pages: 199-204
Page count: 6