Effect of Nonlinear Compression Function on the Performance of the Speaker Identification System under Noisy Conditions

Cited by: 6
Authors
Jawarkar, Naresh P. [1]
Holambe, Raghunath S. [2]
Basu, Tapan Kumar [3]
Affiliations
[1] Babasaheb Naik Coll Engn, Pusad, MS, India
[2] SGGS Inst Engn & Technol, Nanded, MS, India
[3] Acad Technol, Hooghly, WB, India
Source
PERCEPTION AND MACHINE INTELLIGENCE, 2015
Keywords
Speaker identification; noisy environment; GMM; GFCC; MFCC; CEPSTRAL ANALYSIS; RECOGNITION; FEATURES
DOI
10.1145/2708463.2709049
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate speaker identification is difficult due to a number of factors, one of the most prominent being environmental noise. This paper addresses the effect of two nonlinear compression functions used in the feature extraction process, namely log and cubic root, on the performance of a closed-set, text-independent speaker identification system under clean and noisy speaking environments. Performance is analyzed with Mel frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC). The Gaussian mixture model (GMM) approach is used for speaker modeling. Two databases, one Marathi and one Hindi, were used for the experiments. The cubic-root-based features were observed to outperform the log-based features under noisy conditions with SNR < 20 dB.
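The two compression functions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flooring constant, the unnormalized DCT-II, and the sample energies are all assumptions chosen for illustration. It assumes the filterbank energies (Mel or Gammatone) for one frame are already available and shows only the step where log or cubic-root compression is applied before the cepstral transform.

```python
import math

def compress(energies, kind="log", floor=1e-10):
    """Apply a nonlinear compression to filterbank energies (hypothetical helper)."""
    if kind == "log":
        # Floor the energies so that log(0) cannot occur (assumed constant).
        return [math.log(max(e, floor)) for e in energies]
    if kind == "cubic_root":
        return [max(e, 0.0) ** (1.0 / 3.0) for e in energies]
    raise ValueError(f"unknown compression: {kind}")

def dct2(values, num_coeffs):
    """Unnormalized DCT-II, a transform commonly used to obtain cepstral coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]

def cepstral_coefficients(energies, kind="log", num_coeffs=4):
    """Compress the filterbank energies, then decorrelate them with the DCT."""
    return dct2(compress(energies, kind), num_coeffs)

# Illustrative filterbank energies for a single frame (made-up values).
frame_energies = [8.0, 1.0, 0.125, 27.0]
log_ceps = cepstral_coefficients(frame_energies, "log")
cbrt_ceps = cepstral_coefficients(frame_energies, "cubic_root")
```

The only difference between the two feature variants in this sketch is the `kind` argument. Very small energies map to large negative values under the log but stay close to zero under the cubic root, which is one commonly cited intuition for why root compression may be less sensitive to low-energy noise.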
Pages: 137-144
Page count: 8