Effect of Nonlinear Compression Function on the Performance of the Speaker Identification System under Noisy Conditions

Cited by: 6

Authors:

Jawarkar, Naresh P. [1]

Holambe, Raghunath S. [2]

Basu, Tapan Kumar [3]

Affiliations:

[1] Babasaheb Naik Coll Engn, Pusad, MS, India

[2] SGGS Inst Engn & Technol, Nanded, MS, India

[3] Acad Technol, Hooghly, WB, India

Source:

PERCEPTION AND MACHINE INTELLIGENCE, 2015

Keywords:

Speaker identification; noisy environment; GMM; GFCC; MFCC; CEPSTRAL ANALYSIS; RECOGNITION; FEATURES;

DOI:

10.1145/2708463.2709049

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurate speaker identification is difficult due to a number of factors, one of the most prominent being environmental noise. This paper addresses the effect of two nonlinear compression functions used in the feature extraction process, namely the log and the cubic root, on the performance of a closed-set, text-independent speaker identification system under clean and noisy speaking environments. Performance is analyzed with Mel frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC). The Gaussian mixture model (GMM) approach is used for speaker modeling. Two databases, one Marathi and one Hindi, were used for the experiments. The cubic-root-based features were observed to outperform the log-based features under noisy conditions with SNR < 20 dB.
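
To make the two compression choices concrete, the following is a minimal sketch of where the nonlinearity sits in a cepstral feature pipeline: filterbank energies are compressed (log or cubic root) and then decorrelated with a DCT. It is an illustration only, not the authors' implementation. The function names (`compress`, `dct_ii`, `cepstra`), the energy floor, and the choice of 13 coefficients are assumptions, and the Mel or Gammatone filterbank itself is assumed to have been applied already.

```python
import numpy as np

def compress(energies, kind="log", floor=1e-10):
    """Apply a nonlinear compression to non-negative filterbank energies.

    `floor` is an assumed guard against log(0); it is not from the paper.
    """
    energies = np.maximum(np.asarray(energies, dtype=float), floor)
    if kind == "log":
        return np.log(energies)
    if kind == "cubic_root":
        return np.cbrt(energies)
    raise ValueError(f"unknown compression kind: {kind!r}")

def dct_ii(x, num_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first num_coeffs outputs."""
    n = x.shape[-1]
    k = np.arange(num_coeffs)[:, None]   # output (cepstral) index
    m = np.arange(n)[None, :]            # input (filterbank channel) index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))  # shape (num_coeffs, n)
    return x @ basis.T

def cepstra(filterbank_energies, kind="log", num_coeffs=13):
    """Cepstral coefficients from per-frame filterbank energies (frames x channels)."""
    return dct_ii(compress(filterbank_energies, kind), num_coeffs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical input: 5 frames of 26 filterbank energies.
    energies = rng.uniform(0.1, 10.0, size=(5, 26))
    log_feats = cepstra(energies, kind="log")
    root_feats = cepstra(energies, kind="cubic_root")
    print(log_feats.shape, root_feats.shape)
```

Only the `kind` argument differs between the two feature variants, so the same filterbank output can be reused to compare log-based and cubic-root-based features on identical frames.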

Pages: 137-144

Page count: 8
