DISCRIMINANT LOCAL INFORMATION DISTANCE PRESERVING PROJECTION FOR TEXT-INDEPENDENT SPEAKER RECOGNITION

Cited by: 0
Authors
He, Liang [1]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China
Source
2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | 2012
Keywords
information geometry; total variability model; Fisher information; discriminant local preserving projection; text-independent speaker recognition; VERIFICATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A novel method based on a statistical manifold is presented for text-independent speaker recognition. After feature extraction, speaker recognition becomes a sequence classification problem. Once time information is discarded, the core task is the comparison of multiple sample sets. Each set is assumed to be governed by a probability density function (PDF). We estimate the PDFs and place the estimated statistical models on a statistical manifold. The Fisher information distance is used to compute the distance between adjacent PDFs. Discriminant local preserving projection is then used to push apart adjacent PDFs that belong to different classes, further improving recognition accuracy. Experiments were carried out on the NIST SRE08 tel-tel database, where the proposed method achieved excellent performance.
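As a rough illustration of the distance step described in the abstract, the sketch below approximates the Fisher information distance between two nearby PDFs by the square root of their symmetric Kullback-Leibler divergence, an approximation that holds when the PDFs are close on the manifold. It is not taken from the paper: the choice of diagonal-covariance Gaussians as the PDF model and the function names `kl_diag_gauss`, `approx_fisher_distance`, and `pairwise_distances` are assumptions made for this example only.

```python
import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance (assumed model)."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def approx_fisher_distance(mu_p, var_p, mu_q, var_q):
    """Approximate Fisher information distance between nearby PDFs.

    Uses sqrt(KL(p||q) + KL(q||p)), which approaches the Fisher
    information distance as p approaches q.
    """
    sym_kl = kl_diag_gauss(mu_p, var_p, mu_q, var_q) + kl_diag_gauss(
        mu_q, var_q, mu_p, var_p
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(sym_kl, 0.0)))


def pairwise_distances(means, variances):
    """Symmetric matrix of approximate distances between all estimated PDFs."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = approx_fisher_distance(
                means[i], variances[i], means[j], variances[j]
            )
            dist[i, j] = dist[j, i] = d
    return dist
```

For two unit-variance Gaussians that differ only in their means, this approximation reduces to the Euclidean distance between the means, which matches the exact Fisher information distance for that family. A neighborhood graph built from such a matrix is the kind of input a locality preserving projection would need, though the projection itself is not sketched here.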
Pages: 349-352
Number of pages: 4