PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Citations: 0
Authors
Bai, Zhongxin [1 ,2 ]
Zhang, Xiao-Lei [1 ,2 ]
Chen, Jingdong [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Peoples R China
[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Funding
US National Science Foundation; Israel Science Foundation;
Keywords
speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;
DOI
10.1109/icassp40776.2020.9053674
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Deep embedding based text-independent speaker verification has demonstrated superior performance to traditional methods in many challenging scenarios. Its loss functions can be generally categorized into two classes, i.e., verification and identification. The verification loss functions match the pipeline of speaker verification, but their implementations are difficult. Thus, most state-of-the-art deep embedding methods use the identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function, named the maximization of partial area under the receiver operating characteristic (ROC) curve (pAUC), for deep embedding based text-independent speaker verification. We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with the state-of-the-art identification loss functions.
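The abstract names the pAUC objective without giving its form. The sketch below is one common way to estimate a partial AUC empirically and to relax it into a differentiable surrogate; it is an illustration under stated assumptions, not necessarily the authors' exact formulation. The function names, the squared-hinge surrogate, the `margin` value, and the false-positive-rate range `[alpha, beta]` are all assumptions introduced here. In a class-center setup, `pos_scores` could be the similarities between embeddings and their own speaker center, and `neg_scores` the similarities to other speakers' centers.

```python
import numpy as np


def select_hard_negatives(neg_scores, alpha=0.0, beta=0.1):
    """Keep the negative trials whose rank falls in the FPR range [alpha, beta].

    Negatives are sorted from highest to lowest score, so the kept slice
    contains the hardest (most confusable) non-target trials.
    Assumes 0 <= alpha < beta <= 1 and at least one negative score.
    """
    neg_sorted = np.sort(np.asarray(neg_scores, dtype=float))[::-1]
    n = neg_sorted.size
    lo = min(int(np.ceil(alpha * n)), n - 1)
    hi = max(int(np.floor(beta * n)), lo + 1)  # keep at least one negative
    return neg_sorted[lo:hi]


def partial_auc(pos_scores, neg_scores, alpha=0.0, beta=0.1):
    """Empirical pAUC: fraction of (positive, hard negative) pairs ranked correctly."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = select_hard_negatives(neg_scores, alpha, beta)
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a correct ranking.
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def pauc_hinge_loss(pos_scores, neg_scores, alpha=0.0, beta=0.1, margin=0.5):
    """Squared-hinge surrogate (an assumed relaxation) of 1 - pAUC."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = select_hard_negatives(neg_scores, alpha, beta)
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))


if __name__ == "__main__":
    # Toy similarity scores, invented purely for illustration.
    pos = [0.9, 0.8, 0.3]
    neg = [0.7, 0.5, 0.4, 0.2, 0.1, 0.0, -0.1, -0.2, -0.3, -0.4]
    print("pAUC:", partial_auc(pos, neg, alpha=0.0, beta=0.2))
    print("loss:", pauc_hinge_loss(pos, neg, alpha=0.0, beta=0.2))
```

Restricting the sum to the highest-scoring negatives focuses training on the low false-alarm region of the ROC curve, which is where verification systems typically operate.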
Pages: 6819-6823
Page count: 5
Related papers
50 in total
  • [31] Residual Factor Analysis for Text-independent Speaker Verification
    Zhu, Lei
    Zheng, Rong
    Xu, Bo
    PROCEEDINGS OF THE 2009 CHINESE CONFERENCE ON PATTERN RECOGNITION AND THE FIRST CJK JOINT WORKSHOP ON PATTERN RECOGNITION, VOLS 1 AND 2, 2009, : 964 - 968
  • [32] Speaker Verification by Partial AUC Optimization With Mahalanobis Distance Metric Learning
    Bai, Zhongxin
    Zhang, Xiao-Lei
    Chen, Jingdong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1533 - 1548
  • [33] CNN WITH PHONETIC ATTENTION FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Zhou, Tianyan
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    Wu, Jian
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 718 - 725
  • [34] Influence of task duration in text-independent speaker verification
    Fauve, Benoit
    Evans, Nicholas
    Pearson, Neil
    Bonastre, Jean-Francois
    Mason, John
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2728 - +
  • [35] Score normalization for text-independent speaker verification systems
    Auckenthaler, R
    Carey, M
    Lloyd-Thomas, H
    DIGITAL SIGNAL PROCESSING, 2000, 10 (1-3) : 42 - 54
  • [36] A New Score Normalization for Text-Independent Speaker Verification
    Ning, Hongke
    Zou, Y. X.
    Hu, Xuyan
    2014 19TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2014, : 636 - 639
  • [37] The Catcher in the Field: A Fieldprint based Spoofing Detection for Text-Independent Speaker Verification
    Yan, Chen
    Long, Yan
    Ji, Xiaoyu
    Xu, Wenyuan
    PROCEEDINGS OF THE 2019 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY (CCS'19), 2019, : 1215 - 1229
  • [38] Text-independent speaker verification with dynamic trajectory model
    Xiang, B
    IEEE SIGNAL PROCESSING LETTERS, 2003, 10 (05) : 141 - 143
  • [39] Sequential Speaker Embedding and Transfer Learning for Text-Independent Speaker Identification
    Hong, Qian-Bei
    Wu, Chung-Hsien
    Su, Ming-Hsiang
    Wang, Hsin-Min
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 827 - 832
  • [40] Analysis-Based Optimization of Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification
    Kim, Seong-Hu
    Nam, Hyeonuk
    Park, Yong-Hwa
    IEEE ACCESS, 2023, 11 : 60646 - 60659