A NOVEL I-VECTOR FRAMEWORK USING MULTIPLE FEATURES AND PCA FOR SPEAKER RECOGNITION IN SHORT SPEECH CONDITION

Cited by: 0
Authors
Zhang, Chi [1]
Li, Xiaoqiang [1]
Li, Wei [2, 3]
Lu, Peizhong [2]
Zhang, Wenqiang [2, 3]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
[2] Fudan Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[3] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Source
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP) | 2016
Keywords
speaker recognition; short speech condition; PCA; i-vector; JOINT FACTOR-ANALYSIS
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Speaker recognition in the short speech condition is difficult because both the training and the test utterances are very short. One of the main disadvantages of existing speaker recognition methods is that they need a large amount of data, which is usually unavailable in real applications. In our experiments, conventional methods that use a single feature do not perform well on short speech. To overcome this difficulty, we propose a novel i-vector framework that uses multiple features and Principal Component Analysis (PCA) in the short speech condition, since a combination of multiple features can represent more aspects of a speaker. PCA maps the multiple features to an uncorrelated, orthogonal basis set to meet the requirements of the Gaussian Mixture Model (GMM) with diagonal covariance matrices and of the i-vector. Compared with a state-of-the-art system, the proposed approach yields a relative improvement of roughly 50% in equal error rate when evaluated on the telephone conditions of the 2010 NIST Speaker Recognition Evaluation (SRE).
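The decorrelation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pca_decorrelate`, the feature dimensions (20 and 13), and the random stand-in data are all assumptions made for illustration. The sketch stacks two per-frame feature streams and projects them onto the eigenvectors of their covariance matrix, so the projected dimensions are uncorrelated, which is what a diagonal-covariance GMM assumes.

```python
import numpy as np

def pca_decorrelate(frames, n_components=None):
    """Project frames (n_frames x dim) onto the principal axes of their covariance.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    cov = np.cov(centered, rowvar=False)
    # eigh is appropriate because a covariance matrix is symmetric
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort the axes by decreasing variance
    order = np.argsort(eigvals)[::-1]
    eigvecs = eigvecs[:, order]
    if n_components is not None:
        eigvecs = eigvecs[:, :n_components]
    return centered @ eigvecs, mean, eigvecs

rng = np.random.default_rng(0)
# Random stand-ins for two per-frame feature streams (dimensions are assumptions)
feat_a = rng.normal(size=(500, 20))
# The second stream is deliberately correlated with the first
feat_b = 0.5 * feat_a[:, :13] + rng.normal(size=(500, 13))

# Multiple-feature combination: concatenate the streams frame by frame
stacked = np.hstack([feat_a, feat_b])
projected, mean, basis = pca_decorrelate(stacked)

# After projection the covariance is diagonal up to floating-point error
proj_cov = np.cov(projected, rowvar=False)
off_diag = proj_cov - np.diag(np.diag(proj_cov))
```

In a real system the PCA basis would be estimated on training data and then applied unchanged to enrollment and test frames; passing `n_components` would additionally reduce the dimensionality.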
Pages: 499-503
Number of pages: 5