Multilingual Speaker Identification by Combining Evidence from LPR and Multitaper MFCC

Cited by: 2
Authors
Nagaraja, B. [1]
Jayanna, H. [1]
Affiliations
[1] Siddaganga Inst Technol, Dept Informat Sci & Engn, Tumkur 572103, Karnataka, India
Keywords
Speaker identification; mel-frequency cepstral coefficients; multitaper mel-frequency cepstral coefficients; multilingual; linear prediction residual; linear prediction residual phase
DOI
10.1515/jisys-2013-0038
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work demonstrates the significance of combining evidence from multitaper mel-frequency cepstral coefficients (MFCC), linear prediction residual (LPR), and linear prediction residual phase (LPRP) features for multilingual speaker identification under limited-data conditions. The LPR is derived from linear prediction analysis, and the LPRP is obtained by dividing the LPR by its Hilbert envelope. Sine-weighted cepstrum estimators (SWCE) with six tapers are used for multitaper MFCC feature extraction. A Gaussian mixture model-universal background model (GMM-UBM) is used to model each speaker for each type of evidence. The evidence is then combined at the scoring level to improve performance. Monolingual, crosslingual, and multilingual speaker identification studies were conducted using 30 randomly selected speakers from the IITG multivariability speaker recognition database. The experimental results show that the combined evidence improves performance by nearly 8-10% compared with individual evidence.
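The LPR and LPRP derivation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the LP order (10), the 8 kHz sampling rate, the 20 ms synthetic frame, the small regularization terms, and all function names are hypothetical choices made for the example, and the record does not specify any of them.

```python
import numpy as np

def lp_coefficients(frame, order=10):
    """Autocorrelation-method LP coefficients a[1..p], with s[n] ~ sum_k a[k] * s[n-k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # tiny ridge for numerical stability (assumption)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, a):
    """Prediction error e[n] = s[n] - sum_k a[k] * s[n-k] (the LPR)."""
    frame = np.asarray(frame, dtype=float)
    inverse_filter = np.concatenate(([1.0], -a))  # A(z) = 1 - sum_k a[k] z^-k
    return np.convolve(frame, inverse_filter)[:len(frame)]

def hilbert_envelope(x):
    """Magnitude of the analytic signal, built with an FFT-based Hilbert transform."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))

def lp_residual_phase(residual, eps=1e-12):
    """LPRP: the LPR divided sample by sample by its Hilbert envelope."""
    return residual / (hilbert_envelope(residual) + eps)  # eps avoids division by zero

# Synthetic 20 ms frame at an assumed 8 kHz sampling rate (illustrative data only)
rng = np.random.default_rng(0)
t = np.arange(160) / 8000.0
frame = (np.sin(2 * np.pi * 200 * t)
         + 0.5 * np.sin(2 * np.pi * 900 * t)
         + 0.01 * rng.standard_normal(160))

a = lp_coefficients(frame, order=10)
residual = lp_residual(frame, a)
phase = lp_residual_phase(residual)
```

Because the residual is the real part of its own analytic signal, the resulting phase values stay within [-1, 1]. In practice a frame would usually be windowed before analysis, and the per-frame features would then be passed to the speaker models; those steps are omitted here.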
Pages: 241-251
Page count: 11