SNR-Invariant Multitask Deep Neural Networks for Robust Speaker Verification

Cited by: 7

Authors:

Yao, Qi [1]

Mak, Man-Wai [1]

Affiliation:

[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Hong Kong, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2018 / Vol. 25 / No. 11

Keywords:

Deep learning; i-vectors; multitask learning; noise robustness; speaker verification; NOISE; PLDA;

DOI:

10.1109/LSP.2018.2870726

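Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: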

0808 ; 0809 ;

Abstract:

A major challenge in speaker verification is to achieve low error rates under noisy environments. We observed that background noise in utterances will not only enlarge the speaker-dependent i-vector clusters but also shift the clusters, with the amount of shift depending on the signal-to-noise ratio (SNR) of the utterances. To overcome this SNR-dependent clustering phenomenon, we propose two deep neural network (DNN) architectures: hierarchical regression DNN (H-RDNN) and multitask DNN (MT-DNN). The H-RDNN is formed by stacking two regression DNNs in which the lower DNN is trained to map noisy i-vectors to their respective speaker-dependent cluster means of clean i-vectors and the upper DNN aims to regularize the outliers that cannot be denoised properly by the lower DNN. The MT-DNN is trained to denoise i-vectors (main task) and classify speakers (auxiliary task). The network leverages the auxiliary task to retain speaker information in the denoised i-vectors. Experimental results suggest that these two DNN architectures together with the PLDA backend significantly outperform the multicondition PLDA model and mixtures of PLDA, and that multitask learning helps to boost verification performance.
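The multitask idea in the abstract (a main denoising task plus an auxiliary speaker-classification task sharing one network) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the single shared hidden layer, the toy sizes (`DIM`, `HIDDEN`, `N_SPK`), the use of a clean i-vector as the regression target, the loss weight `alpha`, and all function names are assumptions made only for illustration.

```python
import math
import random

def dense(x, W, b):
    """Affine layer: W @ x + b for a single input vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def mt_dnn_forward(noisy_ivec, params):
    """Shared hidden layer feeding a regression head and a classification head."""
    h = relu(dense(noisy_ivec, *params["shared"]))
    denoised = dense(h, *params["denoise"])            # main task: denoised i-vector
    spk_post = softmax(dense(h, *params["speaker"]))   # auxiliary task: speaker posteriors
    return denoised, spk_post

def multitask_loss(denoised, target_ivec, spk_post, spk_id, alpha=0.5):
    """Mean squared error (main task) plus alpha-weighted cross-entropy (auxiliary task)."""
    mse = sum((d - t) ** 2 for d, t in zip(denoised, target_ivec)) / len(target_ivec)
    ce = -math.log(spk_post[spk_id])
    return mse + alpha * ce

rng = random.Random(0)
DIM, HIDDEN, N_SPK = 8, 16, 4  # toy sizes, not taken from the paper
params = {
    "shared": init_layer(HIDDEN, DIM, rng),
    "denoise": init_layer(DIM, HIDDEN, rng),
    "speaker": init_layer(N_SPK, HIDDEN, rng),
}
clean = [rng.gauss(0.0, 1.0) for _ in range(DIM)]    # stand-in for a clean i-vector
noisy = [c + rng.gauss(0.0, 0.3) for c in clean]     # stand-in for its noisy version
denoised, spk_post = mt_dnn_forward(noisy, params)
loss = multitask_loss(denoised, clean, spk_post, spk_id=2)
print(len(denoised), sum(spk_post), loss)
```

In this sketch both heads read the same hidden representation, so gradients from the speaker-classification loss would also shape the features used for denoising, which is one plausible reading of how the auxiliary task helps retain speaker information.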

Citation

Pages: 1670-1674

Page count: 5

50 records in total

[31] GENERATIVE ADVERSARIAL SPEAKER EMBEDDING NETWORKS FOR DOMAIN ROBUST END-TO-END SPEAKER VERIFICATION
Bhattacharya, Gautam
Monteiro, Joao
Alam, Jahangir
Kenny, Patrick
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6226 - 6230
[32] DEEP NEURAL NETWORK-BASED SPEAKER EMBEDDINGS FOR END-TO-END SPEAKER VERIFICATION
Snyder, David
Ghahremani, Pegah
Povey, Daniel
Garcia-Romero, Daniel
Carmiel, Yishay
Khudanpur, Sanjeev
2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 165 - 170
[33] A Deep Neural Network Speaker Verification System Targeting Microphone Speech
Lei, Yun
Ferrer, Luciana
McLaren, Mitchell
Scheffer, Nicolas
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 681 - 685
[34] Total Variability Layer in Deep Neural Network Embeddings for Speaker Verification
Travadi, Ruchir
Narayanan, Shrikanth
IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (06) : 893 - 897
[35] Deep Neural Network Embeddings for Text-Independent Speaker Verification
Snyder, David
Garcia-Romero, Daniel
Povey, Daniel
Khudanpur, Sanjeev
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 999 - 1003
[36] DeepDyve: Dynamic Verification for Deep Neural Networks
Li, Yu
Li, Min
Luo, Bo
Tian, Ye
Xu, Qiang
CCS '20: PROCEEDINGS OF THE 2020 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2020, : 101 - 112
[37] Text-independent speaker verification using predictive neural networks
Finan, RA
Sapeluk, AT
Damper, RI
FIFTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS, 1997, (440): : 274 - 279
[38] OPTIMIZED POWER NORMALIZED CEPSTRAL COEFFICIENTS TOWARDS ROBUST DEEP SPEAKER VERIFICATION
Liu, Xuechen
Sahidullah, Md
Kinnunen, Tomi
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 185 - 190
[39] MULTITASK CLASSIFICATION OF REMOTE SENSING SCENES USING DEEP NEURAL NETWORKS
Alhichri, Haikel
IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 1195 - 1198
[40] DEEP NEURAL NETWORK BASED DISCRIMINATIVE TRAINING FOR I-VECTOR/PLDA SPEAKER VERIFICATION
Zheng Tieran
Han Jiqing
Zheng Guibin
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5354 - 5358
