JOINT I-VECTOR WITH END-TO-END SYSTEM FOR SHORT DURATION TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Huang, Zili [1]

Wang, Shuai [1]

Qian, Yanmin [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, Speech Lab,Dept Comp Sci & Engn, Shanghai, Peoples R China

[2] Tencent, Tencent AI Lab, Bellevue, WA 98004 USA

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Keywords:

speaker verification; end-to-end; i-vector; triplet loss; hard trial selection; embeddings

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Factor-analysis-based i-vectors have been the state-of-the-art method for speaker verification. Recently, researchers have proposed DNN-based end-to-end speaker verification systems that achieve performance comparable to i-vectors. Because the two methods have distinct properties and differ significantly from each other, we explore a framework that integrates the two paradigms to exploit their complementarity. Specifically, we develop and compare four methods for integrating the traditional i-vector into end-to-end systems: score fusion, embedding concatenation, transformed concatenation, and joint learning. All four approaches achieve significant gains. Moreover, hard trial selection is performed on the end-to-end architecture, which further improves performance. Experimental results on a text-independent short-duration dataset generated from SRE 2010 show that the newly proposed method reduces the EER by a relative 31.0% and 28.2% compared to the i-vector and end-to-end baselines, respectively.
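The abstract names the integration methods without giving formulas, so the following is only a minimal sketch of what the two simplest ones, score fusion and embedding concatenation, might look like. Cosine scoring, the fusion weight, per-vector length normalization, and all function names are assumptions made for illustration; none of them is taken from the paper. The last helper shows how a relative EER reduction such as the reported 31.0% is computed from a baseline EER and a new EER.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def unit(x):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(a * a for a in x))
    return [a / n for a in x]


def score_fusion(ivector_score, e2e_score, weight=0.5):
    """Weighted sum of the two systems' trial scores.

    `weight` is a hypothetical tuning parameter, not a value from the paper.
    """
    return weight * ivector_score + (1.0 - weight) * e2e_score


def concat_embedding(ivector, e2e_embedding):
    """Length-normalize each representation, then concatenate them.

    Normalizing before concatenation is an assumption made here so that
    neither representation dominates the cosine score.
    """
    return unit(ivector) + unit(e2e_embedding)


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction in percent."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer
```

A trial could then be scored either with `score_fusion(cosine(iv_enroll, iv_test), cosine(emb_enroll, emb_test))` or with `cosine(concat_embedding(iv_enroll, emb_enroll), concat_embedding(iv_test, emb_test))`, where the four input vectors are placeholders for the enrollment and test representations of each system.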

Citation

Pages: 4869-4873

Page count: 5
