PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Bai, Zhongxin [1,2]

Zhang, Xiao-Lei [1,2]

Chen, Jingdong [1,2]

Affiliations:

[1] Northwestern Polytechnical University, Center of Intelligent Acoustics and Immersive Communications, Xi'an, People's Republic of China

[2] Northwestern Polytechnical University, School of Marine Science and Technology, Xi'an, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

Israel Science Foundation; National Science Foundation (USA);

Keywords:

speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;

DOI:

10.1109/icassp40776.2020.9053674

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Deep embedding based text-independent speaker verification has demonstrated superior performance to traditional methods in many challenging scenarios. Its loss functions can be generally categorized into two classes, i.e., verification and identification. The verification loss functions match the pipeline of speaker verification, but their implementations are difficult. Thus, most state-of-the-art deep embedding methods use the identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function, named the maximization of partial area under the receiver operating characteristic (ROC) curve (pAUC), for deep embedding based text-independent speaker verification. We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with the state-of-the-art identification loss functions.
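The abstract names two ideas without giving formulas: a loss that maximizes pAUC, and training trials built against class (speaker) centers. The following is only an illustrative sketch of how such ideas are commonly realized, not the authors' implementation. Every detail here is an assumption: the function names `pauc_hinge_loss` and `center_trials`, the squared-hinge surrogate, the margin value, the false-positive-rate range `[alpha, beta]`, and the use of cosine similarity as the trial score.

```python
import numpy as np

def center_trials(embeddings, labels):
    """Score every embedding against every speaker center (assumed cosine scoring).

    Returns (target_scores, nontarget_scores) as 1-D arrays.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    classes = np.unique(labels)
    # A speaker center is taken here as the mean of that speaker's embeddings.
    centers = np.stack([emb[labels == c].mean(axis=0) for c in classes])
    centers = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    scores = emb @ centers.T                      # shape: (n_utterances, n_speakers)
    is_target = labels[:, None] == classes[None, :]
    return scores[is_target], scores[~is_target]

def pauc_hinge_loss(pos_scores, neg_scores, alpha=0.0, beta=0.1, margin=0.5):
    """Squared-hinge surrogate over the hardest non-target trials.

    Only non-target scores ranked within the assumed false-positive-rate
    range [alpha, beta] (highest scores first) are paired with target scores.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]   # descending
    lo = int(np.floor(alpha * len(neg)))
    hi = max(int(np.ceil(beta * len(neg))), lo + 1)
    hard_neg = neg[lo:hi]
    diff = pos[:, None] - hard_neg[None, :]       # all target/non-target pairs
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))
```

With N utterances and C speakers, scoring against centers yields N x C trials instead of the roughly N x N pairwise trials, which is one plausible reading of the training-efficiency claim in the abstract.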

Pages: 6819-6823

Number of pages: 5

