JOINT I-VECTOR WITH END-TO-END SYSTEM FOR SHORT DURATION TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0
Authors
Huang, Zili [1]
Wang, Shuai [1]
Qian, Yanmin [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, Speech Lab,Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Tencent, Tencent AI Lab, Bellevue, WA 98004 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
speaker verification; end-to-end; i-vector; triplet loss; hard trial selection; EMBEDDINGS;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Factor analysis based i-vector has been the state-of-the-art method for speaker verification. Recently, researchers have proposed DNN-based end-to-end speaker verification systems that achieve performance comparable to i-vector. Since the two methods have distinct properties and differ significantly from each other, we explore a framework that integrates the two paradigms to exploit their complementarity. Specifically, we develop and compare four methods for integrating the traditional i-vector into end-to-end systems: score fusion, embedding concatenation, transformed concatenation, and joint learning. All four approaches achieve significant gains. Moreover, hard trial selection is performed on the end-to-end architecture, which further improves performance. Experimental results on a text-independent short-duration dataset generated from SRE 2010 show that the newly proposed method reduces the EER by 31.0% and 28.2% relative to the i-vector and end-to-end baselines, respectively.
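As a rough illustration of two of the integration strategies named in the abstract (score fusion and embedding concatenation), and of how a relative EER reduction is computed, the following minimal sketch may help. It is not the authors' implementation: the function names, the fusion weight, the per-system length normalization, the cosine scoring, and all toy vectors are assumptions made for illustration. Transformed concatenation and joint learning require trained network layers and are not shown.

```python
import math


def l2_normalize(v):
    """Scale a vector (list of floats) to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_score(a, b):
    """Cosine similarity between two embeddings of equal dimension."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def score_fusion(ivec_score, e2e_score, weight=0.5):
    """Weighted sum of the two systems' trial scores.

    `weight` is a hypothetical tuning parameter, not a value from the paper.
    """
    return weight * ivec_score + (1.0 - weight) * e2e_score


def concat_embeddings(ivector, e2e_embedding):
    """Length-normalize each embedding, then concatenate them.

    Normalizing before concatenation is an assumption of this sketch.
    """
    return l2_normalize(ivector) + l2_normalize(e2e_embedding)


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction, in percent, with respect to a baseline."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Toy enrollment/test embeddings (made-up numbers, not data from the paper).
enroll_ivec, test_ivec = [0.2, 0.9, 0.1], [0.3, 0.8, 0.0]
enroll_e2e, test_e2e = [1.0, 0.5], [0.9, 0.6]

# Strategy 1: score each system separately, then fuse the scores.
fused_score = score_fusion(
    cosine_score(enroll_ivec, test_ivec),
    cosine_score(enroll_e2e, test_e2e),
)

# Strategy 2: concatenate the embeddings first, then score once.
concat_score = cosine_score(
    concat_embeddings(enroll_ivec, enroll_e2e),
    concat_embeddings(test_ivec, test_e2e),
)
```

Under these assumptions, a hypothetical baseline EER of 10.0% falling to 6.9% would give `relative_reduction(10.0, 6.9)` of about 31.0, which is how a figure such as "31.0% relative" is read; the absolute EER values here are illustrative only and do not come from the record.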
Pages: 4869-4873
Page count: 5