IMPROVING SPEAKER RECOGNITION PERFORMANCE IN THE DOMAIN ADAPTATION CHALLENGE USING DEEP NEURAL NETWORKS

Cited by: 0

Authors:

Garcia-Romero, Daniel [1]

Zhang, Xiaohui

McCree, Alan

Povey, Daniel

Affiliations:

[1] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014 | 2014

Keywords:

Unsupervised adaptation; speaker recognition; i-vectors; deep neural networks

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Traditional i-vector speaker recognition systems use a Gaussian mixture model (GMM) to collect sufficient statistics (SS). Recently, replacing this GMM with a deep neural network (DNN) has shown promising results. In this paper, we explore the use of DNNs to collect SS for the unsupervised domain adaptation task of the Domain Adaptation Challenge (DAC). We show that collecting SS with a DNN trained on out-of-domain data boosts the speaker recognition performance of an out-of-domain system by more than 25%. Moreover, we integrate the DNN in an unsupervised adaptation framework that uses agglomerative hierarchical clustering with a stopping criterion based on unsupervised calibration, and show that the initial gains of the out-of-domain system carry over to the final adapted system. Although the DNN is trained on the out-of-domain data, the final adapted system produces a relative improvement of more than 30% with respect to the best published results on this task.
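The abstract's central step, collecting sufficient statistics from per-frame posteriors, can be sketched as follows. This is a minimal illustration of the standard zeroth- and first-order statistics used in i-vector extraction, not the authors' implementation. The function names, array shapes, and random data are assumptions made for the example; in a GMM system the posteriors would come from mixture component responsibilities, while in the DNN variant they would come from the network's output classes.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; stands in for a DNN output layer (assumption)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def collect_sufficient_stats(features, posteriors):
    """Accumulate zeroth- and first-order statistics for one utterance.

    features:   (T, D) array of acoustic feature frames.
    posteriors: (T, K) array; row t holds the posterior of each of the
                K classes (GMM components or DNN outputs) for frame t.
    Returns (zeroth, first) with shapes (K,) and (K, D).
    """
    features = np.asarray(features, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    zeroth = posteriors.sum(axis=0)      # soft frame count per class
    first = posteriors.T @ features      # posterior-weighted feature sums
    return zeroth, first


# Hypothetical toy data: 200 frames, 4-dimensional features, 3 classes.
rng = np.random.default_rng(0)
T, D, K = 200, 4, 3
features = rng.normal(size=(T, D))
posteriors = softmax(rng.normal(size=(T, K)))

zeroth, first = collect_sufficient_stats(features, posteriors)
```

Because only the source of the posteriors changes, the same accumulation code serves both the GMM and the DNN configuration; the downstream i-vector extractor consumes the statistics in either case.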


Pages: 378-383

Page count: 6
