A NOVEL I-VECTOR FRAMEWORK USING MULTIPLE FEATURES AND PCA FOR SPEAKER RECOGNITION IN SHORT SPEECH CONDITION

Cited by: 0

Authors:

Zhang, Chi ^{[1]}

Li, Xiaoqiang ^{[1]}

Li, Wei ^{[2,3]}

Lu, Peizhong ^{[2]}

Zhang, Wenqiang ^{[2,3]}

Affiliations:

[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China

[2] Fudan Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China

[3] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China

Source:

PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP) | 2016

Keywords:

speaker recognition; short speech condition; PCA; i-vector; joint factor analysis

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Speaker recognition in the short speech condition is difficult because both the training and the test speech are very short. One of the main disadvantages of existing speaker recognition methods is that they need a large amount of data, which is usually unavailable in real applications. In our experiments, conventional methods that use a single feature do not perform well on short speech. To overcome this difficulty, we propose a novel i-vector framework that uses multiple features and Principal Component Analysis (PCA) in the short speech condition, since a combination of multiple features can represent more aspects of a speaker. PCA maps the multiple features onto an uncorrelated, orthogonal basis set to meet the requirements of the Gaussian Mixture Model (GMM) with diagonal covariance matrices and of the i-vector. The improvement of the proposed approach over a state-of-the-art system is roughly 50% relative in equal error rate when evaluated on the telephone conditions of the 2010 NIST speaker recognition evaluation (SRE).
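The decorrelation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature types, their dimensions, or the number of retained components, so the two synthetic streams, the helper names `fit_pca` and `apply_pca`, and the choice of 20 components are all assumptions made for illustration. The sketch stacks two correlated per-frame feature streams, projects them onto the principal axes, and checks that the projected features have a (numerically) diagonal covariance, which is the property a diagonal-covariance GMM relies on.

```python
import numpy as np


def fit_pca(frames, n_components=None):
    """Estimate a PCA basis from frames of shape (n_frames, n_dims).

    Returns the feature mean and a matrix whose columns are the principal
    directions, sorted by decreasing variance.
    """
    mean = frames.mean(axis=0)
    cov = np.cov(frames - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    basis = eigvecs[:, order]
    if n_components is not None:
        basis = basis[:, :n_components]
    return mean, basis


def apply_pca(frames, mean, basis):
    """Project frames onto the PCA basis."""
    return (frames - mean) @ basis


# Synthetic stand-ins for two per-frame feature streams (hypothetical sizes).
rng = np.random.default_rng(0)
n_frames = 2000
stream_a = rng.normal(size=(n_frames, 13))
# stream_b is deliberately correlated with stream_a.
stream_b = 0.8 * stream_a[:, :10] + 0.2 * rng.normal(size=(n_frames, 10))

# Frame-wise concatenation of the streams, then PCA (20 components is arbitrary).
combined = np.hstack([stream_a, stream_b])
mean, basis = fit_pca(combined, n_components=20)
projected = apply_pca(combined, mean, basis)

# After projection the covariance should be diagonal up to rounding error.
proj_cov = np.cov(projected, rowvar=False)
off_diag = proj_cov - np.diag(np.diag(proj_cov))
max_off_diag = np.abs(off_diag).max()
print("largest off-diagonal covariance after PCA:", max_off_diag)
```

In a real system the PCA basis would be estimated on development data and then applied unchanged to enrollment and test utterances before GMM and i-vector training; that pipeline is outside the scope of this sketch.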

Pages: 499-503

Page count: 5
