BUT Text-Dependent Speaker Verification System for SdSV Challenge 2020

Cited by: 6
Authors
Lozano-Diez, Alicia [1 ]
Silnova, Anna [1 ]
Pulugundla, Bhargav [1 ]
Rohdin, Johan [1 ]
Vesely, Karel [1 ]
Burget, Lukas [1 ]
Plchot, Oldrich [1 ]
Glembek, Ondrej [1 ]
Novotny, Ondrej [1 ]
Matejka, Pavel [1 ]
Affiliations
[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic
Source
INTERSPEECH 2020 | 2020
Funding
National Science Foundation (USA);
Keywords
text-dependent speaker verification; phrase-dependent PLDA; phrase recognizer; I-VECTORS;
DOI
10.21437/Interspeech.2020-2882
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification code
100104 ; 100213 ;
Abstract
In this paper, we present the winning BUT submission for the text-dependent task of the SdSV Challenge 2020. Given the large amount of training data available in this challenge, we explore successful techniques from text-independent systems in the text-dependent scenario. In particular, we train x-vector extractors on both in-domain and out-of-domain datasets and combine them with i-vectors trained on concatenated MFCCs and bottleneck features, which have proven effective for the text-dependent scenario. Moreover, we propose the use of a phrase-dependent PLDA backend for scoring and its combination with a simple phrase recognizer, which brings up to 63% relative improvement on our development set with respect to using standard PLDA. Finally, we combine our different i-vector and x-vector based systems using a simple linear logistic regression score-level fusion, which provides 28% relative improvement on the evaluation set with respect to our best single system.
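As an illustration of the score-level fusion mentioned in the abstract, the following is a minimal sketch of linear logistic regression fusion, not the authors' implementation. It assumes each trial has one score per system and a target/non-target label; the function names (train_fusion, fuse), the gradient-descent settings, and the synthetic scores are all hypothetical and chosen only for demonstration.

```python
import numpy as np


def train_fusion(scores, labels, lr=0.1, n_iter=2000):
    """Fit weights w and offset b by minimizing the logistic loss.

    scores: (n_trials, n_systems) array, one column per system (assumed layout).
    labels: (n_trials,) array, 1 for target trials and 0 for non-target trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n_trials, n_systems = scores.shape
    w = np.zeros(n_systems)
    b = 0.0
    for _ in range(n_iter):
        logits = scores @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        err = probs - labels  # gradient of the logistic loss w.r.t. the logits
        w -= lr * (scores.T @ err) / n_trials
        b -= lr * err.mean()
    return w, b


def fuse(scores, w, b):
    """Return one fused score per trial: a weighted sum of system scores plus an offset."""
    return np.asarray(scores, dtype=float) @ w + b


# Synthetic demonstration data (not from the paper): two hypothetical systems
# whose scores are shifted up for target trials and corrupted by Gaussian noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
system_a = 2.0 * labels - 1.0 + rng.normal(0.0, 1.0, size=400)
system_b = 2.0 * labels - 1.0 + rng.normal(0.0, 1.5, size=400)
scores = np.column_stack([system_a, system_b])

w, b = train_fusion(scores, labels)
fused = fuse(scores, w, b)
```

In practice the fusion parameters would be trained on a held-out development set and then applied to evaluation trials; the sketch trains and applies them on the same synthetic data only to stay short.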
Pages: 761-765
Number of pages: 5