A NOVEL I-VECTOR FRAMEWORK USING MULTIPLE FEATURES AND PCA FOR SPEAKER RECOGNITION IN SHORT SPEECH CONDITION

Cited by: 0
Authors
Zhang, Chi [1]
Li, Xiaoqiang [1]
Li, Wei [2,3]
Lu, Peizhong [2]
Zhang, Wenqiang [2,3]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
[2] Fudan Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[3] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Source
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP) | 2016
Keywords
speaker recognition; short speech condition; PCA; i-vector; JOINT FACTOR-ANALYSIS;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Speaker recognition in the short speech condition is difficult because both the training and test utterances are very short. One of the main disadvantages of existing speaker recognition methods is that they need a large amount of data, which is usually unavailable in real applications. In our experiments, conventional methods using a single feature do not perform well on short speech. To overcome this difficulty, we propose a novel i-vector framework that uses multiple features and Principal Component Analysis (PCA) in the short speech condition, since a combination of multiple features can represent more aspects of a speaker. PCA maps the multiple features onto an uncorrelated, orthogonal basis set to meet the requirements of the Gaussian Mixture Model (GMM) with diagonal covariance matrices and of the i-vector. The proposed approach improves on a state-of-the-art system by roughly 50% relative in equal error rate when evaluated on the telephone conditions of the 2010 NIST Speaker Recognition Evaluation (SRE).
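The decorrelation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two feature streams, their dimensions, and the function name `pca_decorrelate` are hypothetical stand-ins, since this record does not say which features the paper combines. The sketch only shows that stacking correlated frame-level features and projecting them onto the PCA basis yields dimensions whose covariance is approximately diagonal, which is the property a diagonal-covariance GMM assumes.

```python
import numpy as np

def pca_decorrelate(features, n_components=None):
    """Project frame-level features (n_frames x dim) onto their principal axes.

    Returns the projected features, the orthonormal basis, and the mean.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Sample covariance of the stacked feature dimensions.
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort axes by decreasing variance.
    order = np.argsort(eigvals)[::-1]
    basis = eigvecs[:, order]
    if n_components is not None:
        basis = basis[:, :n_components]
    return centered @ basis, basis, mean

# Synthetic stand-ins for two feature streams (hypothetical, not from the paper).
rng = np.random.default_rng(0)
n_frames = 500
feat_a = rng.normal(size=(n_frames, 13))
mix = rng.normal(size=(13, 10))
# The second stream is deliberately correlated with the first.
feat_b = feat_a @ mix + 0.1 * rng.normal(size=(n_frames, 10))

# Frame-wise concatenation of the two streams, then PCA.
stacked = np.hstack([feat_a, feat_b])
projected, basis, mean = pca_decorrelate(stacked)

# After projection, the off-diagonal covariance entries should be near zero.
proj_cov = np.cov(projected, rowvar=False)
off_diag = proj_cov - np.diag(np.diag(proj_cov))
print("largest off-diagonal covariance:", np.abs(off_diag).max())
```

In a full system the projected frames would then feed GMM training and i-vector extraction, which this sketch does not cover.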
Pages: 499-503
Page count: 5
Related papers
50 in total
  • [31] Minimax i-vector extractor for short duration speaker verification
    Hautamaki, Ville
    Cheng, You-Chi
    Rajan, Padmanabhan
    Lee, Chin-Hui
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3675 - 3679
  • [32] Tied Variational Autoencoder Backends for i-Vector Speaker Recognition
    Villalba, Jesus
    Brummer, Niko
    Dehak, Najim
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1004 - 1008
  • [33] Discriminatively learned network for i-vector based speaker recognition
    Yao, Shengyu
    Zhou, Ruohua
    Zhang, Pengyuan
    Yan, Yonghong
    ELECTRONICS LETTERS, 2018, 54 (22) : 1302 - 1303
  • [34] DEALING WITH ADDITIVE NOISE IN SPEAKER RECOGNITION SYSTEMS BASED ON I-VECTOR APPROACH
    Matrouf, D.
    Ben Kheder, W.
    Bousquet, P-M.
    Ajili, M.
    Bonastre, J-F.
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2092 - 2096
  • [35] Investigation of Segmentation in i-Vector Based Speaker Diarization of Telephone Speech
    Zajic, Zbynek
    Kunesova, Marie
    Radova, Vlasta
    SPEECH AND COMPUTER, 2016, 9811 : 411 - 418
  • [36] Speaker Adaptation Using i-Vector Based Clustering
    Kim, Minsoo
    Jang, Gil-Jin
    Kim, Ji-Hwan
    Lee, Minho
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (07): : 2785 - 2799
  • [37] Depression detection based on linear and nonlinear speech features in I-vector/SVDA framework
    Mobram, Shamim
    Vali, Mansour
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 149
  • [38] FAST APPROXIMATE I-VECTOR ESTIMATION USING PCA
    Omar, Mohamed Kamal
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4495 - 4499
  • [39] SPEAKER SEGMENTATION USING I-VECTOR IN MEETINGS DOMAIN
    Neri, Leonardo V.
    Pinheiro, Hector N. B.
    Ren, Tsang Ing
    Cavalcanti, George D. da C.
    Adami, Andre G.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5455 - 5459
  • [40] Full multicondition training for robust i-vector based speaker recognition
    Ribas, Dayana
    Vincent, Emmanuel
    Ramon Calvo, Jose
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1057 - 1061