PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Bai, Zhongxin [1,2]

Zhang, Xiao-Lei [1,2]

Chen, Jingdong [1,2]

Affiliations:

[1] Northwestern Polytechnical University, Center of Intelligent Acoustics and Immersive Communications, Xi'an, People's Republic of China

[2] Northwestern Polytechnical University, School of Marine Science and Technology, Xi'an, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

US National Science Foundation; Israel Science Foundation;

Keywords:

speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;

DOI:

10.1109/icassp40776.2020.9053674

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206; 082403;

Abstract:

Deep embedding based text-independent speaker verification has demonstrated superior performance to traditional methods in many challenging scenarios. Its loss functions generally fall into two classes: verification and identification. The verification loss functions match the pipeline of speaker verification, but they are difficult to implement. Thus, most state-of-the-art deep embedding methods use the identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function, named the maximization of the partial area under the receiver operating characteristic (ROC) curve (pAUC), for deep embedding based text-independent speaker verification. We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with the state-of-the-art identification loss functions.
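The abstract names pAUC as the quantity being maximized but does not give a formula. As a rough illustration only, the sketch below computes a generic normalized empirical pAUC from verification scores: it keeps the non-target trials that fall inside a false-positive-rate band [alpha, beta] and counts how often a target trial outscores them. The function name `partial_auc`, the parameters `alpha` and `beta`, and the score values are assumptions made for this example; this is not necessarily the loss used in the paper, which would also need a differentiable surrogate for training.

```python
import math


def partial_auc(target_scores, nontarget_scores, alpha=0.0, beta=0.1):
    """Normalized empirical pAUC over the FPR band [alpha, beta].

    Illustrative estimate only (an assumption of this sketch, not the
    paper's stated formulation).
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("need 0 <= alpha < beta <= 1")
    # Rank non-target trials from highest to lowest score: the top-ranked
    # ones are the false alarms that appear first as the threshold drops.
    ranked = sorted(nontarget_scores, reverse=True)
    n = len(ranked)
    # Small epsilon guards against floating-point drift in alpha*n, beta*n.
    lo = math.floor(alpha * n + 1e-9)
    hi = math.ceil(beta * n - 1e-9)
    band = ranked[lo:hi]
    if not band or not target_scores:
        raise ValueError("empty trial set")
    # Fraction of (target, non-target) pairs ranked correctly; ties count 0.5.
    wins = 0.0
    for t in target_scores:
        for s in band:
            if t > s:
                wins += 1.0
            elif t == s:
                wins += 0.5
    return wins / (len(target_scores) * len(band))


# Hypothetical scores: 3 target trials, 10 non-target trials.
targets = [0.9, 0.8, 0.4]
nontargets = [0.7, 0.5, 0.3, 0.1, 0.0, -0.1, -0.2, -0.3, -0.4, -0.5]

pauc = partial_auc(targets, nontargets, alpha=0.0, beta=0.2)
full_auc = partial_auc(targets, nontargets, alpha=0.0, beta=1.0)
print(pauc, full_auc)
```

With these made-up scores the band-restricted value is lower than the full AUC, because only the highest-scoring (hardest) non-target trials are kept in the comparison.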

Pages: 6819-6823

Page count: 5
